Introduction and Purpose

Although expenditures by not-for-profit (NFP) organizations
account for over one-third of the gross national product of the
United States, the accounting systems for such organizations provide
little more than budgetary control over expenditures. Furthermore,
despite recognition of the need for performance measurement,
operational performance measurement systems are virtually nonexistent.

The purpose of this research was to develop and test a performance
measurement model for a NFP organization. The research
proceeded  in  three  phases.     First,  the  present  state  of  performance 
measurement  in  the  NFP  area  was  assessed.     Second,  the  model  was 
developed.     Finally,  the  model  was  operationalized  by  collecting 
and  analyzing  the  data  required  by  the  model. 


Model  Development  and  Test 

The model was developed for the Gainesville Recreation
Department (GRD), Gainesville, Florida. For the purpose of model
development, the GRD was viewed as a social system whose purpose is
to promote the welfare of the Gainesville community through the
provision of recreation programs and facilities. Ideally, the amount
and type of public recreation provided would be the amount and type
consistent with the maximum total welfare for the Gainesville
community. Through knowledge of the social welfare function for
Gainesville, decision makers establish the optimal recreation budget,
acquire program inputs and transform the inputs via production
functions into those outputs consistent with maximum welfare.

Because  the  social  welfare  function  exceeds  current  knowledge, 
operationalization  of  the  model  was  limited  to  the  development  and 
validation  of  measures  of  program  output  and  input.     With  the 
assistance of GRD supervisors, major recreation programs were identified.
For  most  of  these  programs,  the  GRD  was  able  to  provide  estimates  of 
the  following  objective  inputs  and  outputs:     (1)  direct  costs,  (2) 
labor  hours  of  input,   (3)  participant  and  spectator  hours  (quantity 
of  output)  and  (4)  user  fees.    As  measures  of  output  quantity,  by 
themselves,  provide  insufficient  evidence  of  how  well  the  community 
is  served  by  a  program,  measures  of  program  importance  and  quality 
were also obtained. The measures of program importance and quality
were produced from the opinions of GRD supervisors, Public Recreation
Advisory Board members and a sample of residents in the Gainesville
community. The validity of these subjective measures was evaluated
by use of Delphi procedures and multitrait-multimethod methodology.

Conclusions 

Based  on  Delphi  criteria,  valid  group  judgements  of  program 
importance  and  quality  were  found  to  exist.     Based  on  multitrait- 
multimethod  criteria,  the  measures  of  program  importance  and  quality 
are believed to be valid. The success achieved in measuring and
validating the importance and quality of recreation programs suggests
that subjective measures are well within the purview of current
methodology; to the extent such measures are found useful to NFP decision
makers, they can and should be provided. While a lack of research
resources  precluded  the  measurement  of  objective  inputs  and  outputs 
as  well  as  desired,  the  generation  process  itself  has  demonstrated  the 
potential  for  formally  collecting  such  information. 

The successful collection and validation of the data specified
by the model indicates that the model can be operationalized. Its
implementation would provide quantitative information to decision
makers which should be useful in assessing the contribution of
individual recreation programs to the welfare of the Gainesville community
and in evaluating the impact of budget changes on outputs and inputs.

CHAPTER I
INTRODUCTION 
Purpose  of  Research 
The  general  purpose  of  this  research  is  to  provide  evidence  as 
to  the  feasibility  and  efficacy  of  performance  measurement  in  the 
not-for-profit  (NFP)  area.    The  researcher  hopes  to  accomplish  this 
through the design, development and testing of a performance
measurement model for a NFP organization. Procedurally this will involve (1)
a review and assessment of existing methodologies for NFP performance
measurement, (2) the design and development of a model for measuring
the performance of the Gainesville Recreation Department (GRD) and
(3)  the  operationalization  of  this  model. 

Contribution  of  Research 

Because little theory or prior research exists to guide this
research project, it must be viewed as exploratory. This fact does
not render it any less valuable, however, for as Bauer has noted

...it is recognized that at the present state of the art,
the first efforts to collect new kinds of data will be
seriously defective. Here the conclusion seems to be that
rather than do nothing it is preferable to start out with
bad data, warn everyone about the defects and limitations
and aim at gradual "improvement through use." [emphasis
added] (1967, p. vi)

Thus while definitive conclusions should not be expected, the following
contributions to knowledge are anticipated:

1. Although concern for performance measurement in the NFP area
is evident, researchers are handicapped by the fact that the present
state of this type of measurement is not well known. In this
research, methodologies which have been proposed for dealing with
performance measurement in the NFP area will be reviewed and
evaluated. The extent of their applications and accomplishments will
be discussed. This information should be of value to others who
desire to do research in the area.

2.  Although  the  need  for  performance  measurement  is  well 
recognized, few attempts have been made to develop formal models
for assessing the performance of NFP organizations. In this research
a  performance  measurement  model  will  be  developed  for  the  GRD.  The 
model  development  and  test  should  provide  valuable  insight  into  the 
problems  and  benefits  of  a  formal  evaluation  system.     The  model  will 
hopefully  serve  as  a  guide  and  catalyst  for  the  future  development 
of  performance  measurement  models. 

3.  Evaluation  of  the  performance  of  recreation  departments 
has  been  limited  primarily  to  self-evaluation  studies  (e.g.,  Smissen, 
1972).     Because  the  objectivity  of  such  evaluations  is  questionable, 
this  research  utilizes  both  self  and  outside  evaluations.     By  comparing 
these  evaluations,  conclusions  as  to  the  validity  (and  therefore  the 
usefulness)  of  the  self-evaluation  study  should  be  possible. 

4. Helmer (1966) and Dalkey (1969) advocate the use of expert
judgement via the Delphi technique as a means of dealing with areas
in which exact knowledge is not available. Dalkey has stated that
there are two options available when one is working on a problem
under conditions of uncertainty with insufficient data, incomplete
theory, and a high order of complexity:

...we can either wait indefinitely until we have an
adequate theory enabling us to deal with socio-metric
and political problems as confidently as we do with
problems in physics and chemistry, or we can make the
most of an admittedly unsatisfactory situation and try
to obtain the relevant intuitive insights of experts
and then use their judgements as systematically [via
Delphi] as possible. (quoted in Pill, 1971, p. 61)

In the development of a social service measurement model for the
Cleveland Jewish Community Federation (Mantel et al., 1972), Delphi
was found to be a useful means of producing measures of service
quality and value. Its use in this research to generate measures
of the importance and quality of recreation programs should provide
additional evidence as to its usefulness in developing performance
measures.

5.     The  multitrait-multimethod  methodology  (Campbell  and  Fiske, 
1959)  is  a  powerful  tool  for  assessing  the  validity  of  constructs 
incapable  of  exact  measurement.    While  this  methodology  has  been 
applied in the area of managerial performance measurement (Lawler,
1967), its use in this research represents its first known application
for  the  purpose  of  evaluating  measures  to  be  used  in  assessing  the 
performance  of  a  NFP  organization.     This  methodology  will  hopefully 
provide  a  means  of  objectively  appraising  subjective  measures. 

Scope  of  Research 
The  literature  reviewed  and  evaluated  in  this  research  deals 
primarily with NFP organizations in the United States. The design,
development and testing of the performance measurement model were done
with the cooperation and assistance of the Recreation Department of
the City of Gainesville, Florida.

Nature of the Not-For-Profit Area

The  fundamental  distinction  between  profit  and  NFP  organizations 
is  that  the  primary  goal  of  profit  organizations  is  to  maximize  profits 
whereas  a  NFP  organization  is  concerned  only    indirectly,  if  at  all, 
with  profit  maximization.     NFP  organizations,  such  as  governments, 
schools,  churches  and  hospitals,  are  primarily  concerned  with  the 
welfare  of  their  beneficiary  groups.     These  organizations  do  not  sell 
their  products  and  services  in  the  marketplace  and  therefore  the 
benefits  they  provide  can  seldom  be  measured  in  terms  of  revenue. 

Historically,  the  primary  indicant  of  the  performance  of  profit 
organizations  has  been  the  net  income  generated  for  some  past  period 
of  time.     Net  income,  a  product  of  a  firm's  accounting  system,  is  the 
result  of  applying  the  matching  concept  to  economic  events  affecting 
the  firm.     Since  revenues  are  not  normally  produced  or  sought  by  NFP 
organizations,  net  income  cannot  be  used  to  evaluate  performance  in 
the  NFP  area  (e.g.,  Knighton,  1969;  Committee  on  Concepts  of  Accounting 
Applicable  to  the  Public  Sector,  1972). 

Importance of Not-For-Profit Area
Presently expenditures by NFP organizations constitute over
one-third of the gross national product of the United States. Expenditures
by government, the largest type of NFP organization, have increased
from 7% of GNP in 1902 to 34% of GNP in 1970 (Lee and Johnson, 1973,
p. 38). On a per capita basis this growth represents an increase
from $90 (in 1970 dollars) in 1900 to $1,638 in 1970 (Ibid., p. 39).
Of course, government is not the only NFP organization which has
experienced significant spending increases. For example,
expenditures by hospitals increased from $3.7 billion in 1950 to $28.8
billion in 1971 (American Almanac, 1974, p. 75) and expenditures by
schools increased from $8.8 billion in 1950 to $90.2 billion in 1973
(Ibid., p. 108). On a per capita, real basis (1950 expenditures
expressed in 1971 and 1973 dollars respectively), the respective
increases were from $33.82 to $139.10 (hospitals) and from $92.33
to $428.69 (schools).

In  view  of  the  magnitude  of  expenditures  in  the  NFP  area  and 
the  scarcity  of  resources,  it  is  imperative  that  available  resources 
be  used  efficiently  and  effectively.     The  measurement  of  performance 
has  never  appeared  to  be  more  critical. 

Lack of Performance Measures in the Not-For-Profit Area
The lack of performance measures in the NFP area is well
documented in the literature. Bauer states that "for many of the
important topics on which social critics blithely pass judgement, and on
which policies are made, there are no yardsticks by which to know
if things are getting better or worse" (1967, p. 20). Terleckyj notes
that "in contrast to a few exceptionally advanced fields most areas
of social concern and public policy suffer from lack of even the most
elementary information, leaving the field wide open for guesswork,
emotion, low grade politics and waste" (1970, p. B-765). Wholey
et al. write that


The most impressive finding about the evaluation of
social programs in the federal government is that
substantial work in this field has been almost nonexistent....
There is nothing akin to a comprehensive federal
evaluation system. Even within agencies, orderly and
integrated evaluation operations have not been established.
(1970, p. 15)

The Committee on Not-For-Profit Organizations asserted that "no
'general government' performance criteria, indicators and/or control
devices analogous to the net income measurement of business type
activities have been developed to date" (1974, p. 228).

That the present state of affairs is disconcerting to the highly
educated citizenry characteristic of modern industrial societies is
evident from the following comments by Churchman:

There  is  no  question  that  in  our  age  there  is  a  great 
deal  of  turmoil  about  the  manner  in  which  our  society  is 
run.     Probably  at  no  prior  point  in  the  history  of  man 
has  there  been  so  much  discussion  about  the  rights  and 
wrongs  of  the  policy  makers.... In  all  cases  the  citizen 
feels  a  perfect  right  to  have  his  say  about  the  way  in 
which the managers manage....

Not  only  has  the  citizen  become  far  more  vocal,  but  he 
has  also  in  many  instances  begun  to  suspect  that  the 
people  who  make  the  major  decisions  that  affect  our 
lives  don't  know  what  they  are  doing.     They  don't  know 
what  they  are  doing  simply  because  they  have  no  adequate 
basis  to  judge  the  effects  of  their  decisions.   (1968,  p.  vii) 

Extending  the  Role  of  Accounting 
The  accountant  has  thus  far  limited  his  involvement  in  the  NFP 
area  to  providing  control  over  inputs — accounting  systems  at  present 
only  indicate  compliance  with  legal  and  budgetary  restrictions. 
These  systems  provide  little  information  as  to  the  accomplishments 
of organizational objectives or the efficient use of resources (e.g.,
Henke, 1972; Committee on Accounting Practices of Not-For-Profit
Organizations, 1971; Committee on Concepts of Accounting Applicable
to the Public Sector, 1972). With the recognition of the need for
performance measures in the NFP area, some accountants have
advocated an extension of the accountant's role beyond mere dollar
accountability. Among those advocating an increased role by the
accounting profession in evaluating performance are Bedford (1952),
Churchill and Stedry (1967), Beyer (1969), Knighton (1969 and 1972),
Mobley (1970), Estes (1972), Henke (1972), Brummett (1973) and
Linowes (1973). As to the significance of such an extended role,
Bedford  has  written  that  "whoever  discovers  a  means  for  measuring 
objectively  the  accomplishment  of  such  organizations  [NFP]  will 
contribute  greatly  to  the  accounting  discipline"  (1962,  p.  93). 

Knighton,  one  of  the  strongest  advocates  of  extending  the 
role  of  accounting,  believes  that  the  following  action  is  required: 

We  are  not  in  business  to  spend  the  public  funds. 
Rather,  we  are  charged  with  the  responsibility  for 
using  public  resources  to  bring  about  conditions  that 
promote  the  public  good.     And  as  accountants  in  this 
all-important  endeavor,  we  must  become  as  concerned  with 
accomplishment  of  this  mission  as  with  accounting  for  the 
resources  used.   (1972,  pp.  5-6) 

Furthermore,  he  states 

The objective of government is not to earn revenue. The
objective of government is to provide benefits to society.
And if effort and accomplishment are to be compared or
related here, we must match operating expenses with
information that gives us an indication of public benefits.
Only then can we really say something conclusive about
efficiency and effectiveness in government operations....

We  must  learn  to  design  and  operate  systems  that  report 
information  on  outputs  and  accomplishments.  [Emphasis 
added] (1972, pp. 7-8)

However, recommendations for the extension of the accountant's
scope beyond financial accounting are not limited to members of the
accounting discipline. Among others, Lazarsfeld (1971), a sociologist,
and  Churchman  (1971),  a  management  science  philosopher,  have  urged 
the  accountant  to  serve  society  better  by  moving  beyond  financial 
accounting  into  social  accounting. 

Although the exact role the accountant will play in the
development of performance measures is presently unsettled, there are
indications of a willingness on the part of certain members of the
accounting profession to move beyond mere dollar accountability.
For example, a survey of Big Eight C.P.A. firms by the Committee
on Measures of Effectiveness for Social Programs revealed considerable
enthusiasm for undertaking social program evaluation engagements
(1972, p. 387). Another example is provided by the General Accounting
Office which has become active in evaluating the efficiency and
effectiveness of federal agencies. Its new Standards for Audit of
Governmental Organizations, Programs, Activities and Functions
represents a significant movement beyond financial accounting:

This  demand  (from  public  officials,  legislators  and  the 
general  public)  for  information  has  widened  the  scope  of 
governmental  auditing  so  that  such  auditing  no  longer  is 
a  function  concerned  primarily  with  financial  operations. 
Instead,  governmental  auditing  now  is  also  concerned  with 
whether  governmental  organizations  are  achieving  the 
purpose  for  which  programs  are  authorized  and  funds  made 
available,  are  doing  so  economically  and  efficiently  and 
are  complying  with  applicable  laws  and  regulations.  (1972, 
p.  i) 

Although the need is recognized, thus far performance
measurement in the NFP area has been little more than an attractive slogan.
Operational performance measurement systems are virtually non-existent.
The accounting literature specifically and the social science
literature in general reveal activity primarily at a conceptual,
normative level. Empirical studies are needed in order to develop and
refine performance methodologies and to provide evidence on the
feasibility and efficacy of performance measurement in the NFP
area. The failure of such research to be forthcoming will probably
result in a dissipation of the energy and enthusiasm which have
been generated by the numerous heuristic expressions of interest
in performance measurement. The warning of the Committee on
Accounting for Human Resources appears especially applicable to the
present state of development of performance measurement systems:

Perhaps  the  most  important  task  facing  those  who  wish  to 
do  advance  work  in  accounting  for  human  resources  [substitute 
performance measurement] stems from the necessity to
demonstrate the usefulness of HRA [performance measurement]
systems.     Unless  empirical  data  from  organizations  using 
HRA  systems  are  collected,  analyzed  and  published,  the 
attractiveness  of  current  theoretical  arguments  for  HRA 
may  soon  lose  their  glamour.     (1974,  p.  124) 

Chapter  Preview 
In this chapter the purpose of and need for this research were
set forth.

In  Chapter  II  a  synthesis  of  the  present  state  of  performance 
measurement  in  the  NFP  area  will  be  presented.     This  synthesis  entails 
(1)  an  examination  of  methodologies  which  have  been  proposed  for  use 
in  measuring  performance  in  the  NFP  area  and  (2)  a  discussion  of 
the extent to which these methodologies have been applied. The
usefulness of these methodologies for this research will also be
discussed.

In Chapter III the performance measurement model developed
for evaluating the performance of the GRD is presented and the
specific procedures required to operationalize this model are delineated.

In  Chapter  IV  the  methodologies  to  be  employed  in  developing 
and  testing  the  model  are  discussed. 

In  Chapter  V  the  results  of  collecting  and  analyzing  the  data 
specified  by  the  performance  measurement  model  will  be  presented. 

In Chapter VI the conclusions reached will be stated. Problems
encountered and recognized deficiencies will be discussed. The
applicability of the model to entities other than recreation departments
will be assessed. Recommendations for future research will be
proposed.


CHAPTER  II 


THE  PRESENT  STATE  OF  PERFORMANCE  MEASUREMENT 
IN  THE  NOT-FOR-PROFIT  AREA 


Introduction 

As  indicated  in  Chapter  I  performance  measurement  in  the  NFP 
area  is  in  its  infancy.     Although  the  accountant  has  been  directly 
involved  with  NFP  organizations  for  many  years,  he  has  almost  totally 
neglected  the  area  of  performance  measurement. 

Fortunately,  members  of  other  disciplines  have  attempted  to 
develop  methods  for  measuring  the  performance  of  NFP  activities. 
It is therefore appropriate that we examine these methods in order
(1) to indicate what has previously been done in the area and
thereby present a clearer view of the current state of performance
measurement and (2) to assess the usefulness of the various methods for
the design and development of a performance measurement model for
the GRD. The methods to be examined are

1. Pareto-optimality

2. Cost-benefit analysis

3. Cost-effectiveness analysis

4. Planning-Programming-Budgeting

5. Social indicators

6. Experimental and quasi-experimental research designs

Because of its contribution to this research effort, the
development of a social service measurement model for the Cleveland
Jewish Community Federation (Mantel et al., 1972) will also be
discussed. Finally, political rationality, a school of thought
critical of the formal, rational methods, will be examined.

Pareto-Optimality 

One  discipline  which  has  long  been  concerned  with  optimal 
performance  is  economics.    Whether  from  the  point  of  view  of  the 
individual,  the  firm  or  society,  economists  have  sought  to  determine 
how  resources  can  be  allocated  so  as  to  produce  maximum  welfare. 

The branch of economics which takes the broad social
perspective is welfare economics, whose objective is "the evaluation
of the social desirability [emphasis added] of alternative economic
states" (Henderson and Quandt, 1971, p. 254). The desired state
is the one for which social welfare is at a maximum.

This  optimal  state  cannot  be  identified,  however,  because  the 
utilities  of  individuals  are  not  comparable  and  therefore  there  is 
presently  no  way  to  meaningfully  relate  them  in  a  single  social 
welfare  function.    As  a  result  welfare  economics  has  come  to  be 
based primarily on the concept of Pareto-optimality (Henderson and
Quandt,  1971,  p.  255). 

A state is Pareto-optimal if no reorganization would result
in someone being better off without making someone else worse off.
If a state is not Pareto-optimal, it is possible to reorganize it
with a concomitant increase in social welfare. Thus
Pareto-optimality can serve as a social welfare criterion.

Unfortunately, its usefulness as such a criterion has proved
very limited because

1.  it  is  extremely  difficult,  if  not  impossible,  to  assess 
the  impact  of  a  reorganization  on  the  utility  of  each  member  of 
society 

2.  most  reorganizations  result  in  some  of  society's  members 
being  made  better  off  while  others  are  made  worse  off 

3.  utility,  itself,  is  an  abstract  concept  which  has  never 
been  successfully  quantified. 
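
The definition of Pareto-optimality given above can be made concrete with a minimal sketch. It assumes, purely for illustration, that each person's utility in each state can be written down as a number, which is exactly what the third limitation says has never been achieved in practice. The function names and the three hypothetical states A, B and C are not part of the research.

```python
from typing import Sequence


def pareto_dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if state a leaves nobody worse off than state b
    and makes at least one person better off."""
    if len(a) != len(b):
        raise ValueError("states must list the same individuals")
    nobody_worse = all(x >= y for x, y in zip(a, b))
    someone_better = any(x > y for x, y in zip(a, b))
    return nobody_worse and someone_better


def is_pareto_optimal(state: Sequence[float],
                      alternatives: Sequence[Sequence[float]]) -> bool:
    """A state is Pareto-optimal if no alternative dominates it."""
    return not any(pareto_dominates(alt, state) for alt in alternatives)


# Hypothetical utilities for a two-person society (illustrative numbers only).
states = {"A": (3, 3), "B": (4, 3), "C": (5, 1)}

optimal = sorted(
    name
    for name, utilities in states.items()
    if is_pareto_optimal(
        utilities, [u for other, u in states.items() if other != name]
    )
)
print(optimal)
```

In this sketch both B and C come out Pareto-optimal, yet moving from B to C helps one person and hurts the other. The criterion cannot rank them, which is the second limitation listed above.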

To  make  Pareto-optimality  more  useful,  the  compensation  principle 
was  introduced  (Henderson  and  Quandt,  1971,  p.  279).     According  to 
this principle, a change in which gainers would be able to
compensate the losers, so that all individuals in society would either
favor the change or be indifferent to it, is desirable. However,
since compensation principles generally refer to potential rather
than actual compensation, "the compensation criteria imply
interpersonal comparisons of utility that most economists strive to avoid"
(Ibid., p. 280).
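
The compensation principle can be sketched in the same spirit. The sketch below assumes that each individual's gain or loss from a change has already been expressed in dollars, which is an assumption of the illustration and not something the text claims is achievable; the function name and figures are hypothetical.

```python
from typing import Sequence


def passes_compensation_test(changes: Sequence[float]) -> bool:
    """Return True if total gains are at least as large as total losses,
    so that gainers could in principle fully compensate losers.

    `changes` holds one dollar-equivalent figure per individual:
    positive for a gain, negative for a loss (hypothetical inputs).
    """
    gains = sum(c for c in changes if c > 0)
    losses = -sum(c for c in changes if c < 0)
    return gains >= losses


# One gainer of 10 could compensate two losers of 4 and 3.
print(passes_compensation_test([10, -4, -3]))
# A gain of 2 cannot cover a loss of 5.
print(passes_compensation_test([2, -5]))
```

Adding one person's dollar gain to another person's dollar loss is itself the interpersonal comparison that the quoted passage warns about, so the test is only as good as that assumption.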

Welfare economics in general and Pareto-optimality in
particular are presently of limited value to those desiring to assess the
performance  of  NFP  entities.    According  to  Cohn,  "the  area  of  welfare 
economics  cannot  offer  highly  optimistic  grounds  for  the  choice  of 
ideal  policy  criteria  that  are  designed  to  maximize  welfare"  (1972, 
p.  45).    As  a  result,  decision  makers  have  had  to  turn  to  sub-optimal 
techniques, such as cost-benefit analysis, for guidance.

Cost-Benefit Analysis

Definition  and  Purpose 

Cost-benefit  analysis,  as  an  aid  to  optimal  performance  in 
the  NFP  area,  is  a  method  of  evaluating  and  ranking  in  terms  of 
economic  desirability  projects  posited  to  be  fruitful  in  attaining 
certain  social  goals.     Implementation  of  a  cost-benefit  analysis 
requires  for  each  project 

1.  a  specification  of  all  costs  and  benefits 

2.  a  valuation  of  the  specified  costs  and  benefits 

3.  a  determination  of  an  appropriate  discount  rate 

4.  specification  of  relevant  constraints  (Prest  and  Turvey, 
1965,  p.  158). 

Using  the  above  information  the  net  present  value  of  a  project  can 
be  computed.     A  project  merits  consideration  only  if  its  net  present 
value  is  positive.     (A  negative  net  present  value  implies  that  the 
social  opportunity  cost  exceeds  the  social  value  of  the  benefits.) 
If  several  projects  are  being  considered,  they  can  be  ranked  in  terms 
of  their  net  present  value  which  provides  the  criterion  for  choosing 
the  most  desirable  project — the  one  with  the  largest  net  present 
value.     Ideally,  if  all  projects  being  considered  were  perfectly 
divisible,  resources  could  be  allocated  among  the  various  projects 
so  that  the  ratio  of  marginal  benefits  to  marginal  costs  were  equal 
for  all  projects. 
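
The net present value computation and ranking described above can be sketched as follows. This is a minimal illustration under stated assumptions: benefits and costs are already valued in dollars for each year, year 0 is undiscounted, a single discount rate applies to all projects, and the three projects and their figures are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def net_present_value(benefits: Sequence[float],
                      costs: Sequence[float],
                      rate: float) -> float:
    """Discount each year's net benefit (benefit minus cost) to year 0."""
    if len(benefits) != len(costs):
        raise ValueError("benefits and costs must cover the same years")
    return sum(
        (b - c) / (1.0 + rate) ** year
        for year, (b, c) in enumerate(zip(benefits, costs))
    )


def rank_projects(projects: Dict[str, Tuple[Sequence[float], Sequence[float]]],
                  rate: float) -> List[str]:
    """Keep projects with positive NPV and order them from largest to smallest."""
    npvs = {
        name: net_present_value(benefits, costs, rate)
        for name, (benefits, costs) in projects.items()
    }
    worthwhile = [name for name, value in npvs.items() if value > 0]
    return sorted(worthwhile, key=lambda name: npvs[name], reverse=True)


# Hypothetical projects: (yearly benefits, yearly costs), in thousands of dollars.
projects = {
    "pool": ([0, 60, 60], [100, 5, 5]),
    "trail": ([0, 30, 30], [40, 2, 2]),
    "arena": ([0, 50, 50], [120, 10, 10]),
}

ranking = rank_projects(projects, rate=0.05)
print(ranking)
```

The "arena" project drops out because its net present value is negative, and the remaining two are ordered by net present value, as the text describes.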

In addition to a positive net present value, if cost-benefit
analysis is to serve as an ideal welfare criterion, certain stringent
conditions must be satisfied (Krutilla, 1966). These are

1. opportunity costs are borne by beneficiaries in such wise
as to retain the initial income distribution

2. the initial income distribution is in some sense 'best'

3. the marginal social rates of transformation between any
two commodities are everywhere equal to their corresponding
rates of substitution except for area(s) justifying the
intervention in question. (Krutilla, 1966, p. 177)

In addition to limitations produced by the absence of the
preceding conditions, significant practical implementation problems also
exist:

1. It is often difficult to identify all relevant benefits
and  costs  and  even  where  they  can  be  identified,  quantification  in 
dollar  terms  may  not  be  possible. 

2.  Future  benefits  and  costs  are  uncertain. 

3. Net present values are often quite sensitive to the discount
rate chosen; at present, agreement does not exist as to how an
appropriate discount rate is to be selected.

4.  The  need  to  adjust  market  prices  for  market  imperfections 
and  anticipated  price  changes  introduces  considerable  uncertainty 
into  the  analysis. 

5.  Market  prices  cannot  be  used  to  value  benefits  which  cannot 
be  marketed — the  collective  goods  problem. 
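
The sensitivity to the discount rate noted in the third problem above can be illustrated with a short sketch. The cash flows and the three candidate rates are hypothetical and chosen only to show how the sign of the net present value can change with the rate.

```python
from typing import Dict, Sequence


def net_present_value(net_flows: Sequence[float], rate: float) -> float:
    """Discount a stream of yearly net benefits to year 0 (year 0 undiscounted)."""
    return sum(flow / (1.0 + rate) ** year for year, flow in enumerate(net_flows))


# Hypothetical project: an outlay of 100 followed by four yearly net benefits of 30.
net_flows = [-100.0, 30.0, 30.0, 30.0, 30.0]

# Evaluate the same project under three assumed discount rates.
npv_by_rate: Dict[float, float] = {
    rate: net_present_value(net_flows, rate) for rate in (0.03, 0.07, 0.10)
}

for rate, value in npv_by_rate.items():
    verdict = "merits consideration" if value > 0 else "rejected"
    print(f"rate {rate:.0%}: NPV {value:8.2f} -> {verdict}")
```

Under these assumed figures the project is acceptable at the two lower rates and rejected at the highest, so the choice of rate alone can decide the outcome.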

In  view  of  such  difficulties,  the  results  of  a  cost-benefit  analysis 
will  generally  be  imprecise.     According  to  Dorfman,  "The  debate  about 
benefit-cost  analysis  centers  on  the  question  of  whether  the  social 
value  of  benefits  can  be  estimated  reliably  enough  to  justify  the 
trouble and effort involved in a benefit-cost computation" (1965, p. 8).

Despite the limitations of cost-benefit analysis, Krutilla believes
that

Since  the  alternative  is  not  to  retire  to  inactivity  but, 
rather,  to  reach  decisions  in  the  absence  of  analysis,  we 
may take some comfort from the belief that thinking
systematically about problems and basing decisions on such analysis
are  likely  to  produce  consequences  superior  to  those  that 
would  result  from  purely  random  behavior.   (1966,  p.  189) 

Prest and Turvey, while fully cognizant of the problems of cost-benefit
analysis, also see certain advantages in its application:

1. ...it forces those responsible to quantify costs and
benefits as far as possible rather than rest content with
vague qualitative judgements or personal hunches....

2. ...it has the very valuable by-product of causing questions
to be asked... which would otherwise not have been
raised....

3. ...even if cost-benefit analysis cannot give the right
answers, it can sometimes play the purely negative role in
screening projects and rejecting those answers which are
obviously less promising. (1965, p. 202)

Applications 

That  cost-benefit  analysis  is  not  a  new  idea  is  evident  from 
an  examination  of  the  River  and  Harbor  Act  of  1902  which  required 
that  the  desirability  of  river  and  harbor  projects  be  determined  by 
consideration  of  the  amount  of  commerce  benefited  and  the  cost  (Prest 
and  Turvey,  1965,  p.  155).     However,  it  is  only  in  recent  years  that 
the  technique  has  received  widespread  interest  and  attention.  Prest 
and  Turvey  attribute  the  heightened  interest  to  (1)  the  growth  of 
large  investment  projects,   (2)  the  growth  of  the  public  sector  and 
(3)  the  development  of  operations  research  and  systems  analysis  (1965, 
p. 156).

Recalling that for the successful implementation of cost-benefit
analysis, benefits and costs must be both identified and valued, it is
understandable that cost-benefit analysis has been most extensively
applied to the areas of water resources and transportation. Projects
in these areas (irrigation, hydroelectric, flood control, road,
railway and inland waterway) generally yield some directly identifiable
outputs (benefits) which are usually valued in the marketplace.
Cost-benefit analysis has also been applied, on a much more limited
scale, to the areas of land usage, health, education, research and
development and defense. For areas such as these, however, it is
much more difficult to identify the benefits produced by a particular
project. Furthermore, the benefits are often not valued in
the private marketplace. Cost-benefit ratios produced for projects
in these areas will generally be less reliable than those for the
areas of water resources and transportation.

Use  In  This  Research 

Although  cost-benefit  analysis  appears  to  be  a  useful  method 
for  evaluating  projects  in  certain  areas,  it  is  of  dubious  value 
for  this  research.     For  recreation  in  general,  the  relationships 
between  recreational  activities  and  social  well-being,  physical  and 
mental  health,  productivity,  crime,  property  values  and  economic 
growth  have  not  been  empirically  established.     More  specifically, 
data  on  the  assumed  benefits  provided  the  Gainesville  community  by 
the City's recreation programs do not exist and market values for
these benefits are virtually non-existent. In view of these facts,
reliable cost-benefit ratios could not be produced.

Furthermore, unless opportunity costs are borne by beneficiaries,
the provision of public recreation programs will alter the distribution
of income in the Gainesville community. Since most of
the expense of recreation is paid from ad valorem taxes and not from
recreation fees based on usage of programs and facilities, it is
very unlikely that opportunity costs are being borne by beneficiaries.
For example, upper income groups, who have more wealth but less
need for public recreation, would appear to be subsidizing lower
income groups, who have greater need but less wealth. Such distribution
effects limit the usefulness of cost-benefit analysis as an
evaluation tool.

Cost-Effectiveness  Analysis 
Definition  and  Purpose 

Since cost-benefit analysis requires market prices for the valuation
of benefits, it is not applicable in areas where such prices
do not exist. To provide a guide to rational decision making in
these areas, a technique entitled cost-effectiveness analysis has
been developed. Utilizing this technique, it is possible to maximize
a certain (non-dollar) output for a given cost or to minimize
the cost of producing a given output.
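The two decision rules just described can be sketched as follows. This is only an illustration under assumed data: the alternatives, their costs, and their output figures are hypothetical, and the output unit (for example, participant-hours) is an assumption rather than a measure taken from the source.

```python
from typing import NamedTuple, Optional, Sequence


class Alternative(NamedTuple):
    name: str
    cost: float    # dollars, valued at market prices
    output: float  # non-dollar output units (hypothetical, e.g. participant-hours)


def max_output_for_budget(alternatives: Sequence[Alternative],
                          budget: float) -> Optional[Alternative]:
    """Rule 1: among alternatives costing no more than `budget`, pick the largest output."""
    feasible = [a for a in alternatives if a.cost <= budget]
    return max(feasible, key=lambda a: a.output) if feasible else None


def min_cost_for_output(alternatives: Sequence[Alternative],
                        required_output: float) -> Optional[Alternative]:
    """Rule 2: among alternatives meeting `required_output`, pick the cheapest."""
    feasible = [a for a in alternatives if a.output >= required_output]
    return min(feasible, key=lambda a: a.cost) if feasible else None


# Hypothetical alternatives, for illustration only.
options = [
    Alternative("A", 40_000, 9_000),
    Alternative("B", 55_000, 14_000),
    Alternative("C", 70_000, 15_000),
]

print(max_output_for_budget(options, budget=60_000))
print(min_cost_for_output(options, required_output=10_000))
```

Note that neither rule says whether the output is worth its cost; both simply rank alternatives once a budget or an output target has been fixed.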

In order to implement a cost-effectiveness analysis, project
outputs must be specifiable and project costs must be capable of
valuation at market prices.

Since future outputs and costs are uncertain, the results of
a cost-effectiveness analysis will be imprecise and must be interpreted
with caution. Because outputs are not valued, the technique
cannot provide an index of social desirability (Cohn, 1972, p. 73).
Furthermore, the absence of the dollar as a common metric means
that the technique cannot be used to guide choices between different
goals.

Applications 

Although the ultimate origins of cost-effectiveness analysis
lie in economic production and utility theory, its birth as an analytical
technique must be assigned to the efforts of defense research
during and after World War II (Goldman, 1967, p. v). As might be
expected, its greatest use has been in the area of national defense.

Use  In  This  Research 

Since cost-effectiveness analysis does not require the valuation
of benefits, it has potential for use in this research project,
provided the output of the GRD can be defined in some meaningful
way.

Planning-Programming-Budgeting 
Definition  and  Purpose 

Planning-Programming-Budgeting (PPB) is an approach to budgeting
which seeks to extend the budgetary process beyond the area of input
control and into the area of efficient and effective resource management.
It focuses on the program as the basis for planning and budgeting,
the program being the total resource commitment for achieving
some common objective. The application of PPB entails (1) setting
goals, (2) determining the alternative programs for achieving these
goals, (3) using some decision rule (such as that provided by a cost-benefit
or a cost-effectiveness analysis) to select the optimal programs,
and (4) allocating resources to the optimal programs.

PPB  represents  a  significant  departure  from  the  traditional 
line  item  method  of  budgeting.    Whereas  in  traditional  budgeting 
the  guide  to  current  resource  allocation  is  the  previous  year's  budget, 
the  guide  to  resource  allocation  in  a  PPB  approach  is  the  goal  to 
be  accomplished  and  the  program  resources  thought  to  be  required 
for  achieving  this  goal. 

Another  significant  and  important  difference  between  PPB  and 
traditional  budgeting  is  that  under  PPB  the  time  period  considered 
extends  beyond  the  single  budget  year.     PPB  involves  multi-year 
planning  and  a  multi-year  budget. 

Applications 

During the 1950s, the Rand Corporation developed PPB for use
in defense budgeting. In 1961, under Defense Secretary Robert McNamara's
direction and support, PPB was introduced into the Department of
Defense. McNamara required that the defense budget for fiscal 1963
be formulated in terms of major programs and weapon systems (Held,
1970, p. 13). "The results of this reorganization and of the evaluations
it has made possible, led to recommendations that the approach
be extended to civilian affairs" (Ibid., p. 13).

In  1965  President  Lyndon  Johnson  announced  that  all  federal 
agencies  would  be  required  to  use  PPB.     According  to  the  President, 
each  federal  department  was  required  to 

Develop  its  objectives  and  goals,  precisely  and  carefully; 

Evaluate  each  of  its  programs  to  meet  these  objectives, 
weighing  the  benefits  against  the  costs; 

Examine  in  every  case,  alternative  means  of  achieving 
these  objectives; 

Shape  its  budget  request  on  the  basis  of  this  analysis 
and  justify  that  request  in  the  context  of  a  long  range 
program  and  financial  plan.     (Lyden  and  Miller,  1970,  p.  5) 

Given all the fanfare and optimism with which PPB was introduced into
the federal government*, it surely must have appeared to some that
a new era of rationality had engulfed the public expenditure process.
This euphoria, however, was to be short-lived. The difficulties
of translating a conceptual PPB into a functioning, viable PPB were
to prove insurmountable.

Schick,  in  a  review  of  PPB,  noted  that 

The publicity has outdistanced the performance by a wide margin.
In the name of analysis, bureaus have produced reams of unsupported,
irrelevant justification and description. As
Schumpeter said of Marxism: it is preaching in the garb of
analysis. (1969, p. 149)

Stanley  Botner  in  another  review  of  PPB  reported  that 


*Although  a  few  state  and  local  governments  have  introduced  PPB  into 
their  budget  systems,  PPB  should  be  viewed  primarily  as  a  federal 
effort.

In mid-1968 the Bureau of the Budget undertook to determine
if policy analysis is performed differently than it was before
the advent of PPB. BOB researchers conducted a study
of  implementation  and  utilization  of  PPB  by  16  domestic 
federal  agencies.     Their  findings  led  them  to  conclude  that 
most  agencies  do  not  perform  the  planning,  programming,  and 
budgeting  functions  much  differently  than  they  did  before 
the  introduction  of  PPB.     (1970,  p.  423) 

Furthermore,  "An  analysis  of  the  results  of  recent  studies,  dis- 
cussions with  Budget  Bureau  and  other  officials  and  testimony  before 
congressional  subcommittees  leads  one  to  conclude  that  PPB  has  thus 
far  been  rather  ineffectual  as  a  presidential  staff  tool"  (Ibid.). 

In view of its lackluster performance, it was not surprising
that a change in administrations (from Johnson to Nixon) sealed
the doom of PPB. In June 1971, a memorandum from the Office of
Management and Budget stated that

Agencies are no longer required to submit with their budget
submissions the multi-year program and financing plans, program
memoranda and special analytical studies... or the
schedules... that reconcile information classified according
to their program and appropriation structures. (Office of
Management and Budget Transmittal Memorandum No. 38, June
21, 1971)

"By these words," according to Schick, "PPB became an unthing"
(1973, p. 146).

Post mortems as to the cause of death (e.g., Schick, 1973;
Tiller, 1972) offer the following reasons for PPB's failure:

1. PPB was introduced on too large a scale and without sufficient
advance preparation and planning.

2. PPB was suddenly thrust upon the various federal agencies
without their participation in the decision to implement PPB. Many
agency leaders were thus initially alienated from PPB and never
became committed to it.

3. There never existed a sufficient number of personnel capable
of providing the information and analysis needed for a total federal
PPB. At its peak the Bureau of the Budget staff responsible for
PPB in all the federal agencies numbered fewer than 12.

4.  The  data  needed  for  various  analytical  purposes  simply 
did  not  exist. 

5. PPB ignored the traditional budgeting process, yet Congress
and most of the federal agencies continued to use this process as
the one for decision making.

It does not appear that PPB failed because of conceptual unsoundness.
Rather, its failure must be attributed to deficiencies in implementation,
disregard of the traditional budgeting process and a
lack of sufficient resources for administration. Although PPB, as
a total federal effort, has ceased to exist, it has served to stimulate
a great interest in program evaluation and economic rationality.

Use  In  This  Research 

The  City  of  Gainesville  uses  traditional,  line  item  budgeting. 
A  system  comparable  to  PPB  has  never  been  attempted  and  it  is  well 
beyond  the  scope  of  this  research  project  to  attempt  to  install  a 
PPB  system  for  the  GRD. 

However, it appears that the program is a practical and useful
unit of account. Therefore, major recreation programs will be identified
and measures of program input and output will be produced.

Social Indicators

Definition  and  Purpose 

According to Bauer, social indicators are "statistics, statistical
series, and all other forms of evidence that enable us to assess
where we stand and are going with respect to our values and goals
and to evaluate specific programs and determine their impact" (1967,
p. 1). Thus defined, economic indicators are a subset of social
indicators. However, it was dissatisfaction with the narrowness
of economic indicators (e.g., gross national product, unemployment)
that has fostered so much interest in the broader concept of social
indicators:

...the highly quantitative economic data in today's economic
survey documents tend to detract attention from ideas
that cannot be so readily expressed in quantitative terms....
Economic statistics, as a whole, emphasize the monetary value
of goods and services. By so doing, they tend to discriminate
against nonmonetary values and against public services
for which costs invariably serve as surrogates of output
value. Because figures on health and life expectancy are
not directly incorporated in national economic accounts,
progress in these areas may be seriously ignored, either
in formulating goals or in evaluating performance....

In short, national economic accounting has promoted a
"new Philistinism" -- an approach to life based on the principle
of using monetary units as the common denominator of
all that is important in human life.

This bias can be overcome only by persistent efforts to
develop broader models that include many more variables than
those thus far used by economists. (Gross, 1967, pp. 167-168)

The consensus of opinion of the advocates of systems of social indicators
is that economic indicators, while necessary, are not sufficient
for the evaluation of social systems and the measuring of
progress towards social system goals.

Some of the areas which have been identified as being extremely
important to society's welfare but which are not measured by economic
indicators are (1) health, (2) social mobility, (3) condition of
physical environment and human habitat, (4) public order and safety,
(5) learning, science, art, leisure, and (6) freedom and justice.
Concerning indicators of progress in such areas as those noted above,
the following statements from Toward a Social Report are apropos:

The Nation [United States] has no comprehensive set of statistics
reflecting social progress or retrogression. There is no
Government procedure for periodic stocktaking of the social
health of the Nation. The Government makes no Social Report
[emphasis added]. (1969, p. xi)

Applications 

Although comprehensive systems of social indicators have been
formulated (e.g., Gross, 1967; Terleckyj, 1969), the transition from
a conceptual to an operational system remains to be made. At the conceptual
level,  the  problems  of  (1)  goal  specification  and  agreement,  (2) 
identification  and  validation  of  measures  of  goal  attainment,  and 
(3)  discovering  the  relationships  between  inputs  (resources  used) 
and  outputs  (posited  as  conducive  to  goal  attainment)  can  be  ignored 
or  assumed  away.     But  at  the  operational  level,  these  problems  must 
be  addressed  successfully.     This  will  require  great  effort,  expense 
and  time. 


Use  In  This  Research 

In this research project an attempt will be made to develop
and validate indicators believed useful for assessing the performance
of the GRD. In a sense, the model to be developed is an information
system which provides indicators of program cost, usage, quality
and importance.

Experimental  and  Quasi-Experimental 
Designs  for  Program  Evaluation 

Definition  and  Purpose 

The current interest in program evaluation has resulted in
an advocacy by some social scientists of the use of experimental
and quasi-experimental research designs* for the purpose of determining
program effectiveness (e.g., Campbell, 1969; Rossi and
Williams, 1972; Campbell, 1974). According to Campbell,

The  United  States  and  other  modern  nations  should  be  ready 
for  an  experimental  approach  [emphasis  added]  to  social 
reform,  an  approach  in  which  we  try  out  new  programs  designed 
to  cure  specific  social  problems,  in  which  we  learn  whether 
or  not  these  programs  are  effective,  and  in  which  we  retain, 
imitate,  modify  or  discard  them  on  the  basis  of  apparent 
effectiveness  on  the  multiple  imperfect  criteria  available. 
(1969,  p.  409) 

True experimental designs form the basis for much of the "experimental
approach" referred to by Campbell. Such designs require

1. manipulation of independent variable(s) by an investigator
(such manipulation may take the form of exposing one group
(the treatment group) to a particular social program while withholding
it from another (the control group), or it may mean exposing
each group to a different program)

2. establishing statistical equivalency between treatment
and control groups by randomly assigning a pool of subjects (ideally
selected at random from a larger population) to each group

3. observing and measuring the variance between the treatment
and control groups.


*For a comprehensive discussion of experimental and quasi-experimental
research designs, along with the various threats to internal and external
validity, the reader may consult Campbell and Stanley, 1963.

The  great  power  of  the  experimental  design  lies  in  its  control  of 
alternative  explanations  (rival  hypotheses)  of  the  cause  of  the 
variance  observed  between  the  treatment  and  control  groups.  Unless 
these  alternative  explanations  are  eliminated,  the  variance  observed 
cannot  be  warrantedly  attributed  to  the  treatment  (social  program) 
and  the  effectiveness  of  the  treatment  will  remain  unknown. 
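The three requirements of the true experimental design can be sketched in a few lines. This is a minimal illustration only: the function names, the equal split into two groups, and the use of a simple difference in group means as the measure of variance between groups are all assumptions made for the example, not procedures specified by Campbell.

```python
import random
from statistics import mean


def randomly_assign(subjects, seed=None):
    """Randomly split a pool of subjects into treatment and control groups.

    Random assignment is what makes the two groups statistically
    equivalent before the treatment is applied.
    """
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def mean_difference(treatment_scores, control_scores):
    """Observed difference between groups on the outcome measure."""
    return mean(treatment_scores) - mean(control_scores)


# Hypothetical pool of 20 subject identifiers.
treatment, control = randomly_assign(range(20), seed=1)
print(len(treatment), len(control))
```

Because assignment is random, a difference observed after the program has been delivered cannot readily be explained by pre-existing differences between the groups.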

Situations  often  exist  for  which  equivalent  treatment  and 
control  groups  are  not  possible.     Therefore  one  of  the  basic  require- 
ments for  the  experimental  design  is  lacking.     In  such  situations, 
however,  it  is  often  possible  to  achieve  something  comparable  to 
the  experimental  design  by  utilizing  non-random  control  groups  or 
by  using  the  treatment  group  as  its  own  control.     Such  research 
designs  are  referred  to  as  quasi-experimental  designs.     Since  for 
these  designs,  full  experimental  control  is  lacking,  it  is  essential 
that  the  researcher  be  fully  aware  of  the  alternative  explanations 
as  to  the  cause  of  observed  variance. 


Applications 

Although the canon of controlled, comparative experimentation
was established by Fisher in the 1920's, subsequent experiments
in the Fisherian tradition have been limited primarily to the laboratory
and the agricultural experiment station (Campbell, 1969, p. 425).
Rossi,  while  noting  that  "there  exist  elegant  models  for  carrying 
out  evaluation  studies,  derived  mainly  from  the  controlled  experiment 
tradition,"  finds  that  "there  are  almost  no  examples  of  evaluation 
studies  of  current  programs  which  have  followed  these  models  with 
any  appreciable  degree  of  fidelity"  (1972,  p.  29). 

The greatest obstacle to the use of experimental designs for
evaluating social programs appears to be the reluctance of program
administrators to allow subjects to be assigned, at random, to the
treatment and control groups. Randomization is seen as inhumane
and it conjures up notions of eccentric scientists.

Another difficulty lies in the fact that for some programs
randomization is not possible, either because potential subjects are
limited in number or because the program is an all-or-nothing affair
(so that it is not possible to have a control group).

Such  problems  in  utilizing  the  true  experiment  have  resulted 
in  recommendations  that  the  quasi-experimental  designs  be  used  where 
it  is  impractical  or  impossible  to  utilize  the  experimental  designs 
(e.g.,  Campbell  and  Stanley,  1963;  Campbell,  1969). 

Use  In  This  Research 

Experimental and quasi-experimental designs are powerful methods
for determining the impact and effectiveness of programs. However,
their use requires direct intervention in presently existing processes.
Since intervention in the current recreation programs
of the GRD is outside the scope of this research project, experimental
and quasi-experimental designs are inappropriate for this research.

A  Social  Service  Measurement  Model  for  the 
Cleveland  Jewish  Community  Federation 

Interest in measurement models for the evaluation of NFP organizations
has been exhibited in recent years. One model of particular
interest and merit was developed by an interdisciplinary research
team at Case Western Reserve University. The research team undertook
(1) "to develop a consistent, relevant and reasonably reliable set
of data on the services offered" by the 21 agencies of the Cleveland
Jewish Community Federation (JCF) and (2) "to develop a model or
set of models that would produce a standardized assessment of the
agencies and their associated services" (Mantel et al., 1972, pp. 1-2).

As a first step in defining the output of the JCF system,
answers to the following questions were sought:

1.  What  are  the  goals  of  the  system? 

2.  What  services  are  being  delivered? 

3.  Who  received  the  services? 

4. Which agencies deliver the various services? (Ibid., p. 8)

General goals were first identified. Then answers to the last 3
questions enabled a 3-dimensional unit of account, the service-client-agency
package, to be produced. The development of this unit of account
permitted the "research task to be redefined as the development
of a measure of the 'output' associated with each service-client-agency
package" (Ibid., p. 11). Output was to be defined in terms
of (1) the number of clients served, (2) the amount of service time received,
(3) the value of the physical throughput (items 1 and 2),
and (4) the quality of the physical throughput.

The physical throughput measures were obtained from agency records.
The value measure, an indicant of the relative importance of each
service-client-agency package, was considered "analogous to that of
price in the economic measurement of industrial output" (Mantel
et al., 1972, p. 13). The Delphi technique (see page 71 for a description
of the technique) was used to develop value measures for
each service-client-agency package. Four separate reference groups
participated in the Delphi exercises, which involved the assignment
of 1 of 5 verbally described levels of importance to the service-client-agency
packages. Whenever approximately 80% of each group's
ratings were in any 2 contiguous categories of importance, agreement
(within the group) as to the value of the service-client-agency
package was deemed to exist (Reisman et al., 1970, p. 17).
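The agreement rule just described lends itself to a short sketch. The function name, the coding of the five importance levels as the integers 1 through 5, and the treatment of "approximately 80%" as a strict threshold of 0.80 are assumptions made for illustration; the source does not say how the approximation was applied.

```python
from collections import Counter


def has_agreement(ratings, levels=5, threshold=0.80):
    """Return True if at least `threshold` of the ratings fall in any
    two contiguous importance categories (1..levels)."""
    n = len(ratings)
    if n == 0:
        return False
    counts = Counter(ratings)
    # Check each adjacent pair of categories: (1,2), (2,3), ..., (levels-1, levels).
    return any(
        (counts[k] + counts[k + 1]) / n >= threshold
        for k in range(1, levels)
    )


# Hypothetical ratings from one reference group of ten raters.
group_ratings = [4, 4, 5, 5, 5, 4, 5, 4, 2, 1]
print(has_agreement(group_ratings))
```

In the JCF study the rule was applied separately within each of the four reference groups, so a check of this kind would be run once per group for each package.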

In order to develop a measure of the quality of service the
research team decomposed quality into 6 component criteria which
were in turn decomposed into several elements. This rather complicated
process involved

1. collection of element data

2. specification of weights (by a Delphi process) for combining
elements into criteria

3. restatement of criteria scores as utility scores through
a set of transformation functions relating individual criterion
scores to their associated utility values (again a Delphi process
was used)

4. specification of weights (via Delphi) for combining the
criteria into a quality score.

Concerning  this  approach  to  developing  a  measure  of  quality,  the 
research  team  noted  that  "since  no  'true'  measure  of  quality  is 
known,  the  purpose  here  was  to  develop  indicators  of  those  aspects 
of  quality,  the  criteria,  which  appear  to  be  important  parts  of  the 
undefined quality whole" (Mantel et al., 1972, p. 16).
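The four steps listed above can be sketched as a small aggregation routine. Everything specific in it is hypothetical: the criterion names, the weights, the utility functions, and the use of simple weighted sums are assumptions for illustration, since the source does not give the functional forms the research team used. Only two criteria are shown, whereas the JCF model used six.

```python
def weighted_sum(values, weights):
    """Combine values with weights; both sequences must have equal length."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(v * w for v, w in zip(values, weights))


def quality_score(element_data, element_weights, utility_functions, criterion_weights):
    """Aggregate element data into one quality score.

    Steps: elements -> criterion scores -> utility scores -> quality score.
    """
    utilities = {}
    for criterion, elements in element_data.items():
        criterion_score = weighted_sum(elements, element_weights[criterion])
        utilities[criterion] = utility_functions[criterion](criterion_score)
    names = list(utilities)
    return weighted_sum([utilities[n] for n in names],
                        [criterion_weights[n] for n in names])


# Hypothetical data for illustration only.
element_data = {"accessibility": [0.5, 1.0], "staffing": [0.8]}
element_weights = {"accessibility": [0.5, 0.5], "staffing": [1.0]}
utility_functions = {
    "accessibility": lambda s: s,             # linear utility (assumed)
    "staffing": lambda s: min(1.0, s / 0.8),  # utility saturates at a score of 0.8 (assumed)
}
criterion_weights = {"accessibility": 0.6, "staffing": 0.4}

score = quality_score(element_data, element_weights, utility_functions, criterion_weights)
print(round(score, 2))
```

In the JCF model the weights and transformation functions came from Delphi exercises rather than being fixed by the analyst as they are here.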

The  research  team  believed  that  the  measurement  model  developed 
could  be  used  to 

1. compare output data by service-client-agency package across
time

2. compare output data for a given service-client package
between agencies offering the same package

3. compare output data for different service-client packages
either across time or between agencies.

Furthermore, 

actual use of the model focuses primarily on expected
changes in budget allocation. Each operational change
contemplated by an agency which has a change in the level
of resources associated with it... can be evaluated in
terms of the expected changes in the basic measures of system
output... These expectations can be compared to those presented
by another agency seeking funds for its own proposals.
The budget Committee, which must make decisions on how to
allocate scarce resources, has access to quantified expectations
for each of the competitors for resources. (Mantel
et al., 1972, p. 30)

Since the ultimate value of this or any other model for decision
making must be ascertained through its use, it is auspicious that the
JCF Board of Trustees has committed the JCF to actual use of the
model for the next several years.

Although the pioneering work performed by the Case Western
research team has served as an extremely useful guide for this research,
certain important differences in approach and methodology exist between
that work and this research. These will now be discussed.

In the JCF model, convergence among rater groups was used as the
criterion for assessing the validity of value measures. Concerning
the process of validating subjective measures, Campbell and Fiske
write

For the justification of novel trait measures, for the validation
of test interpretation, or for the establishment of
construct validity, discriminant validation as well as convergent
validation is required. (1959, p. 31)

Both convergent and discriminant validity are being utilized in this
research.

The  quality  measures  generated  in  the  JCF  model  are  the  result 
of  a  complex  process  involving  the  identification,  measurement  and 
weighting  of  criteria  hypothesized  to  contribute  to  quality.  In 
this  research,  no  attempt  will  be  made  to  identify  quality  elements 
and  then  aggregate  them  into  a  quality  measure.     Instead  a  global 
measure of quality will be employed. Global measures are much
simpler to work with and existing evidence suggests that such global
measures reasonably approximate those obtained through a more extensive
process (Lawler, 1967, p. 370).

In the JCF model no attempt was made to identify relationships
between measures of input and output. Because such relationships
are believed to be of value to decision makers, an attempt will be
made in this research to identify input-output relationships through
the use of regression analysis.
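As a minimal sketch of what identifying an input-output relationship through regression can look like, the following fits a straight line by ordinary least squares. The single-input linear form, the function name, and the data are assumptions for illustration; the passage above does not specify the regression model to be used.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns (slope, intercept).
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical program data: input (e.g. staff hours) and output (e.g. usage).
program_input = [1.0, 2.0, 3.0, 4.0]
program_output = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(program_input, program_output)
print(slope, intercept)
```

The slope estimates the change in output associated with one additional unit of input, which is the kind of relationship a decision maker weighing a budget change would want to see.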

Political  Rationality 

Each of the methods previously examined represents, to some
extent, a formalized, rational approach to decision making in the
NFP area. Their proponents believe that these methods, by making
explicit the goals, alternatives, costs, benefits, performance criteria,
etc., offer a considerable improvement over decisions based
on intuition, hunch and past experience.

However, the value of the formal rational approach (first
method) has been challenged by a school of thought associated primarily
with Charles Lindblom (1959; 1965). Lindblom (1959) describes
another  approach  (second  method)  which  he  believes  is  more  realistic 
and  better  suited  to  decision  making  in  the  NFP  area.     This  method 
of  decision  making  is  based  primarily  on  collective  wisdom  and  the 
past  experience  of  decision  makers.     Entitled  successive-limited 
comparisons,  it  is  characterized  by  the  following: 

1.  Selection  of  value  goals  and  empirical  analysis  of  the 
needed  action  are  not  distinct  from  one  another  but  are 
closely  intertwined. 

2.  Since  means  and  ends  are  not  distinct,  means-end  analysis 
is  often  inappropriate  or  limited. 

3.  The  test  of  a  "good" policy  is  typically  that  various 
analysts  find  themselves  directly  agreeing  on  a  policy  (with- 
out their  agreeing  that  it  is  the  most  appropriate  means  to 
an agreed objective).

4. Analysis is drastically limited:

i)     Important  possible  outcomes  are  neglected, 
ii)     Important  alternative  potential  policies 
are  neglected, 
iii)     Important  affected  values  are  neglected. 

5.  A  succession  of  comparisons  greatly  reduces  or  elimi- 
nates reliance  on  theory.   (Lindblom,  1959,  p.  81) 

Concerning  these  two  approaches,  Lindblom  writes 

For  complex  problems,  the  first  [formal,  rational]  of  these 
two  approaches  is  of  course  impossible.     Although  such  an 
approach  can  be  described,  it  cannot  be  practiced  except 
for  relatively  simple  problems  and  even  then  only  in  a  some- 
what modified  form.     It  assumes  intellectual  capacities  and 
sources  of  information  that  men  simply  do  not  possess  and 
it  is  even  more  absurd  as  an  approach  to  policy  when  the  time 
and money that can be allocated to a policy problem is limited,
as is always the case. Of particular importance to public
administrators  is  the  fact  that  public  agencies  are  in  effect 
usually  instructed  not  to  practice  the  first  method.  That 
is  to  say,  their  prescribed  functions  and  constraints-the 
politically  or  legally  possible-restrict  their  attention  to 
relatively  few  values  and  relatively  few  alternative  policies 
among  the  countless  alternatives  that  might  be  imagined.  It 
is  the  second  method  that  is  practiced. 

Curiously,  however,  the  literature  of  decision  making,  policy 
formulation,  planning  and  public  administration  formalize  the 
first  approach  rather  than  the  second,  leaving  public  adminis- 
trators who  handle  complex  decisions  in  the  position  of  prac- 
ticing what  few  preach.   (Ibid.,  p.  80) 

An analysis of the federal budgeting process by Wildavsky (1964)
revealed that federal resource allocation decisions follow an approach
quite similar to that suggested by Lindblom.* Wildavsky (1966)
maintains that methods which seek to rationalize the decision making
process are generally deficient in that they ignore the political


*For  business  decisions,  Cyert  and  March  (1963)  express  a  view  of 
the decision making process which is quite similar to Lindblom's.

costs of decisions. While a policy (program, project, etc.) may be
optimal according to certain efficiency or effectiveness criteria
(e.g., cost-benefit analysis) it may be politically non-optimal and
irrational. Wildavsky states that

exchange  costs  are  incurred  by  a  political  leader  when  he 
needs  the  support  of  other  people  to  get  a  policy  adopted. 
He  has  to  pay  for  this  assistance  by  using  up  resources  in 
the  form  of  favors  (patronage,  log-rolling)  or  coercive  moves 
(threats  or  acts  to  veto  or  remove  from  office).     By  supporting 
a  policy  and  influencing  others  to  do  the  same,  a  politician 
antagonizes  some  people  and  may  suffer  their  retaliation. 
If  these  hostility  costs  mount,  they  may  turn  into  re-election 
costs-actions  that  decrease  his  chances  (or  those  of  his 
friends)  of  being  elected  or  reelected  to  office.  Election 
costs,  in  turn,  may  become  policy  costs  through  inability 
to  command  the  necessary  formal  powers  to  accomplish  the 
desired  policy  objectives.   (1966,  p. 

In  light  of  the  difficulties  encountered  by  social  scientists 
seeking  to  evaluate  social  programs  (e.g.,  Rossi  and  Williams,  1972; 
Campbell,  1969)  and  the  fate  of  the  federal  PPB  (see  p.  22  ),  the 
Lindblom  thesis,  with  its  emphasis  on  the  political  nature  of  decisions, 
should  not  be  ignored. 


Summary 

This  review  of  performance  measurement  in  the  NFP  area  has 
revealed  the  existence  of  some  sophisticated  methods  for  assessing 
the  efficiency  and  effectiveness  with  which  resources  are  allocated 
towards  desired  ends.     It  also  has  revealed  that  most  of  these  methods 
are  not  widely  used.     Furthermore,  where  attempts  to  apply  such  methods 
have been made, the results achieved have not been equal to
expectations.

While the methods surveyed in this chapter warrant a belief
in  the  efficacy  and  feasibility  of  performance  measurement,  problems 
of  implementation  indicate  that  progress  will  come  slowly.  Researchers, 
therefore,  should  avoid  the  oversell  and  reconcile  themselves  to 
piecemeal  progress  thru  empirical  applications. 


CHAPTER  III 


DESIGN  AND  DEVELOPMENT  OF  A  PERFORMANCE 
MEASUREMENT  MODEL  FOR  THE  GAINESVILLE 
RECREATION  DEPARTMENT 

Introduction 

As  indicated  in  Chapter  II,  the  problems  encountered  in  imple- 
menting the  formal  methods  for  performance  measurement  present  a 
challenge  to  those  who  believe  in  the  efficacy  and  feasibility  of 
performance  measurement  in  the  NFP  area.     In  response  to  that  challenge 
the main thrust of this research project was conceived to be the
development, with the close cooperation of a NFP organization, of an
operational  performance  measurement  model. 

Once  the  decision  was  made  to  attempt  to  design  and  develop  a 
performance  measurement  model  for  a  NFP  organization,  the  cooperation 
of  a  specific  NFP  entity  had  to  be  secured. 

The  Gainesville  Recreation  Department 
Selection  of  the  Gainesville  Recreation  Department 

Since the researcher had been working with the City of Gainesville,
Florida, in the performance of audit and systems work, this City was
a  logical  first  choice.     However,  a  model  for  the  entire  City  was 
well  beyond  the  resources  of  the  researcher.     Therefore,  the  researcher 
decided  to  select  a  single  City  department  to  work  with. 
The Recreation Department was chosen because

1. its operational structure conformed to that of the JCF (the
social service measurement model developed for the JCF has served
as a guide for this research; see p. 29)

2.  it  was  a  politically  low-key  department  and  therefore  the 
occurrence  of  events  which  could  abort  the  research  appeared  remote 

3.  performance  measurement  in  the  field  of  recreation  is  vir- 
tually non-existent  (e.g.,  Hatry  and  Dunn,  1971;  Kraus  and  Curtis, 
1973);     the  area  thus  appeared  to  offer  the  opportunity  of  challenging, 
long-term  research. 

Once the Recreation Department had been selected, the Gainesville
City Manager was approached with a statement of the purpose of the
research and a request for cooperation. The City Manager endorsed the research and
directed  the  researcher  to  the  GRD.     A  lengthy  interview  with  the  Di- 
rector and  Assistant  Director  of  Recreation  revealed  enthusiastic  sup- 
port for  the  research  project  and  a  promise  of  cooperation. 

Description of Gainesville Recreation Department

The  GRD  is  comprised  of  an  administrative  division,  a  maintenance 
division and 4 operating divisions: aquatics, athletics, centers and
playgrounds.     The  4  operating  divisions  are  directly  responsible  for 
the  various  recreation  programs  provided  to  the  Gainesville  community. 
The  primary  responsibilities  of  these  operating  divisions  will  now 
be  discussed. 

Aquatics. The primary responsibility of this division is to
provide water recreation (primarily swimming pool). Its major areas
of activity are public swimming, instructional swimming and
competitive swimming.

Athletics. The primary responsibility of this division is
to provide organized team sports. Its major areas of activity
include baseball, basketball, football, softball and instruction in
various athletic skills.

Centers. This division provides indoor activities which
include arts and crafts, dance, music and numerous games. The
recreation centers also serve as a meeting place for numerous civic,
educational and social organizations.

Playgrounds.  This  division  provides  supervised  social  activi- 
ties for  youth  throughout  the  City.  Activities  include  team  sports, 
arts  and  crafts,  dramatics,  gymnastics  and  field  trips. 

In  addition  to  the  activities  sponsored  by  the  operating  di- 
visions, the  GRD  is  responsible  for  maintaining  park  and  picnic 
facilities,  tennis  courts  and  racketball  courts. 

As shown in the organization chart for the GRD (Figure 1), the
Director  of  Recreation  reports  to  the  City  Manager  (administrative 
head  of  City  government)  who  in  turn  reports  to  the  City  Commission — 
the  elected  representatives  of  Gainesville  residents.     The  City 
Commission appoints interested Gainesville residents to serve on the
Public Recreation Advisory Board (PRAB). This board advises the
City  Manager  in  the  area  of  recreation  and  meets  monthly  with  the 
Director of Recreation for the purpose of discussing recreation
policies, programs and problems.

Each month the GRD submits a report of its activities to the
City  Manager,  City  Commission  and  the  PRAB.     The  report  describes 
major  recreation  events  occurring  during  the  month  and  provides  in- 
formation on  attendance  at  recreation  facilities. 

The  GRD,  along  with  all  other  City  departments,  prepares  its 
budget  for  submission  to  the  City  Manager's  Office  in  the  spring. 
After  numerous  bargaining  sessions,  a  final  revised  budget  is  pre- 
pared.    It  is  submitted  by  the  City  Manager  to  the  City  Commission. 
After  revisions  desired  by  the  City  Commission  have  been  made,  the 
final budget is approved.

Recreation  expenditures  are  controlled  by  this  annual  budget 
which  reflects  appropriations  by  division  and  by  object  of  expendi- 
ture.    Program  budgeting  is  not  used.     For  1974-75,  $465,087  was 
appropriated  in  the  budget  for  recreation  expenditures.  Actual 
expenditures  for  1973-74  were  $384,407. 

The  GRD  receives  monthly  financial  reports  which  follow  the 
budget  format  and  depict  (1)  appropriations,   (2)  year-to-date  expend- 
itures and  encumbrances,   (3)  unencumbered  appropriations,  and  (4) 
monthly  expenditures. 

Status  of  Performance  Measurement  in  the  Gainesville  Recreation 
Department 

Information obtained through interviews with supervisors in
the  GRD  and  from  an  examination  of  departmental  documents  indicated 
that  a  formal  system  for  evaluating  department  performance  did  not 
exist. Data by program (cost, number of participants, number of
staff, etc.) was not regularly collected and maintained. The monthly
reports (see p. 41) submitted to the City Manager, City Commission
and PRAB are unsuited for decision making. (Based on the extensive
number  and  size  of  errors  found  in  these  reports  and  on  conversations 
with  GRD  supervisors,  it  became  evident  that  the  reports  were  not 
being  used  for  decision  making.) 

Departmental  objectives  were  extremely  vague.     Objectives  by 
division  and  program  had  not  been  defined. 

In  summary,  performance  measurement  is  at  best  based  on  ad  hoc 
intuitive  evaluations. 

The preceding remarks are not intended as an indictment of
the GRD. Rather they are made for the purpose of (1) indicating
the  undeveloped  state  of  performance  measurement  and  (2)  illustrating 
the  amount  and  type  of  preliminary  work  which  must  be  done  before 
a  formalized  system  of  performance  measurement  can  be  developed. 
The  situation  in  the  GRD  is  characteristic  of  much  of  the  NFP  area. 

The  model  developed  in  this  research  should  be  evaluated  in 
light  of  the  present  state  of  performance  measurement  in  the  NFP 
area.     It  should  not  be  viewed  as  an  extension  of  an  already  sophis- 
ticated decision  making  process. 

The  Performance-Measurement  Model 
Once  the  researcher  had  become  familiar  with  the  organizational 
structure  and  activities  of  the  GRD,  the  design  of  a  performance 
measurement  model  was  begun.     The  model  proposed  is  the  result  of 
the  researcher's  study  of  performance  measurement  methodologies 
(see Chapter II) and his knowledge of GRD activities. The model is
illustrated by means of Figure 2 and is discussed below. The
conceptual model (Part I of Figure 2) is presented first. The
procedures taken to operationalize the conceptual model (Part II of
Figure 2) are then presented. (The numbers in brackets in the following
discussion refer to the relevant facets of Figure 2.)

Conceptual  Model 

For  the  purposes  of  this  research,  the  GRD  is  viewed  as  a 
social  system  whose  purpose  (assumed)  is  to  promote  the  welfare 
of  the  Gainesville  community  thru  the  provision  of  recreation  pro- 
grams and  facilities.     Ideally,  the  amount  and  type  of  public  recre- 
ation provided  would  be  the  amount  and  type  consistent  with  the 
maximum  total  welfare  for  the  Gainesville  community.  Assuming 
knowledge  of  the  social  welfare  function  for  the  community,  production 
functions,  and  cost  functions,  the  conditions  for  maximum  welfare 
can  be  identified  and  the  amount  and  type  of  public  recreation 
required  for  a  welfare  maximum  can  be  determined.    A  mathematical 
example  of  this  process  follows: 

Let

$P_i, \quad i = 1, \ldots, n$

be recreation programs (outputs)

$H_j, \quad j = 1, \ldots, w$

be inputs

$P_i = P_i(H_{i1}, \ldots, H_{iw})$

be production functions; $H_{ij}$ is the amount of the $j$th
input used in producing the $i$th program

$W = W(P_1, \ldots, P_n, C_1, \ldots, C_n)$

be the social welfare function; $\partial W / \partial P_i > 0$: as $P_i$
is increased welfare increases; $\partial W / \partial C_i < 0$: increasing
$C_i$ entails reducing the output of some other good and thereby the
welfare derived from its consumption (this welfare loss is the
"true cost" of producing one more unit of $P_i$).

W can be maximized subject to the production and cost functions.
Substituting the production functions for $P_i$ and cost functions for
$C_i$ yields

$W = W[P_1(H_{11}, \ldots, H_{1w}), \ldots, P_n(H_{n1}, \ldots, H_{nw}), C_1(H_{11}, \ldots, H_{1w}), \ldots, C_n(H_{n1}, \ldots, H_{nw})]$

Calculate the partials of W with respect to $H_{ij}$ and set them equal
to zero:

$\frac{\partial W}{\partial H_{ij}} = \frac{\partial W}{\partial P_i} \frac{\partial P_i}{\partial H_{ij}} + \frac{\partial W}{\partial C_i} \frac{\partial C_i}{\partial H_{ij}} = 0$

The first term of each partial represents the gain in welfare from
the use of 1 more unit of $H_{ij}$ in the production of $P_i$. The second
term represents the loss in welfare from such usage. Since
$\partial W / \partial C_i < 0$, the sum of the 2 terms for a partial may be
interpreted as the net welfare gain (loss) from the production of one
more unit of $P_i$. Solving the set of partial equations for all
$H_{ij}$ would permit the determination of all $P_i$ and $C_i$. Given
$C_i$, the optimal recreation budget is $\sum_{i=1}^{n} C_i$.
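
As an illustration only, the first-order condition can be worked through for a single program and a single input. The functional forms below are hypothetical assumptions chosen for tractability; they are not estimated from GRD data and are not part of the model itself.

```latex
% Hypothetical one-program, one-input case (n = 1, w = 1).
% Assumed forms: P = \sqrt{H}, C = cH with c > 0, W = aP - C with a > 0,
% so that \partial W/\partial P = a > 0 and \partial W/\partial C = -1 < 0.
\begin{align*}
W(H) &= a\sqrt{H} - cH \\
\frac{dW}{dH} &= \frac{\partial W}{\partial P}\,\frac{dP}{dH}
               + \frac{\partial W}{\partial C}\,\frac{dC}{dH}
               = \frac{a}{2\sqrt{H}} - c = 0 \\
H^{*} &= \left(\frac{a}{2c}\right)^{2}, \qquad
P^{*} = \sqrt{H^{*}} = \frac{a}{2c}, \qquad
C^{*} = cH^{*} = \frac{a^{2}}{4c}
\end{align*}
```

Under these assumed forms the second derivative is negative for all positive H, so the solution is a maximum, and the optimal budget for the single program is simply C*.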

As  depicted  in  Figure  2,  Part  I,  the  values,  desires  and  needs 
of the members of the Gainesville community [1] enter into the
determination of the social welfare function [2]. Through
knowledge of this function, decision makers [3] can establish the
optimal  recreation  budget  [4].     With  this  budget  the  necessary 
program  inputs  [5]  can  be  acquired.     These  inputs  are  transformed 
via  the  production  functions  [6]  into  those  outputs  [7]  consistent 
with  maximum  welfare  [8]. 

Operational  Model 

Limitations  and  plans 

Since $W = W(P_1, \ldots, P_n, C_1, \ldots, C_n)$ is unknown, the amount of
each $P_i$ consistent with a welfare maximum cannot be determined.
Furthermore, while benefits expressed in dollars (a cost-benefit
approach)  can  sometimes  be  used  as  a  surrogate  for  utility  (Cohn, 
1972),  the  dollar  value  of  the  benefits  of  the  GRD  recreation  pro- 
grams cannot  be  reliably  determined  (see  p.  17).     Thus  at  the  oper- 
ational level,  the  amount  and  type  of  public  recreation  to  be  pro- 
vided is  outside  the  purview  of  the  model  and  must  be  left  to  the 
judgement  of  decision  makers. 

While  the  social  welfare  function  (analogous  to  the  profit 
function  in  the  profit  sector)  appears  beyond  current  knowledge, 
the  other  elements  involved  in  the  welfare  maximization  process  do 
appear  susceptible  to  identification,  measurement  and  analysis. 
In  the  profit  sector,  for  example,  outputs,  inputs,  production 
functions  and  cost  functions  are  an  integral  and  routine  part  of 
management's  information  system.     Such  information  should  be  of 
considerable  value  to  NFP  decision  makers  for  it  would  reduce  some 
of the uncertainty associated with the welfare maximizing process
(see p. 43) and thereby render this process both more efficient and
effective.

The  purpose  of  the  operational  model  for  the  GRD  (Figure  2, 
Part II) is to develop information [18] about program outputs, inputs
(including costs) and production for the use of Gainesville decision
makers [19]. Such information should be useful in assessing the
contribution  of  individual  recreation  programs  to  the  welfare  of 
the  Gainesville  community  and  in  evaluating  the  impact  of  budget 
changes  on  outputs  and  costs. 

Program  identification 

The  first  action  taken  was  to  identify  the  major  recreation 
programs  offered  by  the  GRD.     Once  identified  these  programs  served 
as  the  basis  for  producing  measures  of  input  and  output  and  the  de- 
termination of  relationships  between  inputs  and  outputs.     Each  of 
the  four  operating  division  heads  [14]  was  asked  to  supply  the  re- 
searcher with  a  list  of  major  recreation  programs  [12]  provided  by 
his division during the fiscal year ending September 30, 1974.* These
programs  were  reviewed  and  revised  by  the  Assistant  Director  of  Re- 
creation.    The  revised  list  of  programs  was  then  reviewed  by  the 
division  heads  who  concluded  that  the  programs  on  the  list  were 
indicative of major recreation activities. Furthermore these
programs were acceptable to them for use as the basic unit for obtaining
input and output measures.


*Since the researcher began working with the GRD in February, 1975,
fiscal 1974 represented the most recent and complete year of
operations.

In  total  55  programs  (activities)  were  identified.  These 
programs,  by  division,  are  listed  in  Appendix  A. 

Output  measures 

While output measures for recreation programs have been largely
ignored by recreation administrators (Hatry and Dunn, 1971, p. 14),
the use and value of measures of the quantity and quality of
recreation output have been discussed in the literature (Mack and Myers,
1965; Kraus and Curtis, 1973; Hatry and Dunn, 1971; Wennergren and
Fullerton, 1972). In the social service measurement model developed
for the Jewish Community Federation, the output of each type of
service was defined in terms of its quantity, quality and importance
(see  p.  29).     Since  a  GRD  recreation  program  is  analogous  to  a  "type 
of  service",  similar  measures  would  appear  to  be  useful  in  defining 
recreation  output. 

The  potential  usefulness  of  measures  of  the  quantity,  quality 
and  importance  of  recreation  programs  was  discussed  with  the  Director 
and  Assistant  Director  of  the  GRD,  division  heads  and  members  of  the 
PRAB.     All  groups  expressed  the  opinion  that  such  measures  would  be 
useful.

Several  PRAB  members  also  stated  that  information  concerning 
the  adequacy  of  recreation  facilities  would  be  useful.     Therefore  a 
measure  of  facility  adequacy  will  be  included  among  the  output 
measures (see p. 64).

Quantity of output

The  number  of  direct  user-service  hours  provided  by  a  recre- 
ation program  will  be  used  as  a  measure  of  output  quantity  [16]. 
Direct  users  are  primarily  program  participants.     However,  some 
programs also involve spectators (e.g., ball games, swim meets, etc.).
User-service  hours  will  be  obtained  by  multiplying  the  number  of 
participants  and  spectators  by  their  respective  average  number  of 
hours  of  usage. 

While  it  may  be  appealing  to  add  participant  and  spectator 
hours  together  for  a  single  measure  of  physical  output,  such  aggre- 
gation would  represent  an  arbitrary  assignment  of  equal  weights  to 
each  type  of  output  quantity.     Therefore  each  type  will  be  reported 
separately  so  that  decision  makers  can  assign  whatever  weights  they 
may believe appropriate.
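
The calculation described above can be sketched as follows. This is a minimal illustration, assuming division heads supply four estimates per program; the class name, field names and the sample figures are hypothetical and do not come from GRD records.

```python
from dataclasses import dataclass


@dataclass
class ProgramUsage:
    """Estimated usage for one program (all figures are hypothetical)."""
    name: str
    participants: int
    avg_participant_hours: float
    spectators: int = 0
    avg_spectator_hours: float = 0.0


def user_service_hours(usage: ProgramUsage) -> dict:
    """Return participant and spectator hours separately.

    The two quantities are deliberately not summed, so that decision
    makers can apply whatever weights they consider appropriate.
    """
    return {
        "participant_hours": usage.participants * usage.avg_participant_hours,
        "spectator_hours": usage.spectators * usage.avg_spectator_hours,
    }


# Hypothetical example: 120 swimmers averaging 6 hours each,
# 300 spectators averaging 2 hours each.
swim_meets = ProgramUsage("swim meets", 120, 6.0, 300, 2.0)
hours = user_service_hours(swim_meets)
print(hours)
```

Keeping the two figures in separate fields mirrors the reporting choice made in the text: any weighting scheme is applied afterwards by the user of the report, not built into the measure.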

Because  records  of  the  number  of  program  users  and  hours  of 
usage  are  not  maintained  by  the  GRD,  it  will  be  necessary  to  rely 
on  estimates  provided  by  division  heads  in  the  GRD.     This  use  of 
estimates,  while  necessary,  introduces  the  potential  for  error — 
results  of  analyses  based  on  such  estimates  must  be  interpreted 
cautiously  with  this  limitation  in  mind. 

As  measures  of  output  quantity,  by  themselves,  provide  in- 
sufficient evidence  of  how  well  the  community  is  served  by  a  program, 
measures  of  program  quality  and  program  importance  will  also  be 
obtained.

Quality and importance of output

Quality.     Program  quality,  as  used  in  this  research,  means 
"how  good  the  program  is."    While  the  determinants  of  program  qual- 
ity have  not  been  identified  and  verified  empirically,  most  of  these 
determinants  are  believed  to  be  under  the  control  of  and  therefore 
the  responsibility  of  the  GRD. 

Importance. Program importance, as used in this research,
means  "how  much  a  program  contributes  to  making  the  Gainesville 
community  a  more  enjoyable  place  to  live."    This  measure  is  presumed 
to  reflect  the  values  and  preferences  of  the  Gainesville  community, 
to  be  fairly  stable  over  time  and  to  be  independent  of  GRD  activi- 
ties.    Measures  of  program  importance  generated  by  the  GRD  can  be 
compared  with  measures  generated  by  the  PRAB  and  the  community  in 
order  to  obtain  evidence  as  to  whether  or  not  the  GRD  is  correctly 
assessing  the  values  of  the  community. 
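
The comparison of importance measures across rater groups might be sketched as below. This is an illustration only: the 1-5 rating scale, the program names used as keys and every rating shown are assumptions made for the example, not questionnaire data from this research.

```python
from statistics import mean

# Hypothetical ratings on an assumed 1-5 importance scale.
# Outer key: rater group; inner key: program; value: individual ratings.
ratings = {
    "GRD": {"public swimming": [5, 4, 5], "arts and crafts": [4, 4, 3]},
    "community": {"public swimming": [5, 5, 4], "arts and crafts": [2, 3, 2]},
}


def mean_importance(group_ratings):
    """Average the individual ratings for each program in one rater group."""
    return {program: mean(scores) for program, scores in group_ratings.items()}


def importance_gaps(group_a, group_b):
    """Mean rating of group_a minus mean rating of group_b, per shared program."""
    means_a = mean_importance(group_a)
    means_b = mean_importance(group_b)
    return {p: means_a[p] - means_b[p] for p in means_a if p in means_b}


gaps = importance_gaps(ratings["GRD"], ratings["community"])
for program, gap in gaps.items():
    print(f"{program}: GRD minus community = {gap:+.2f}")
```

A gap near zero suggests the two groups rate a program similarly; a large positive or negative gap flags a program whose importance the GRD may be misjudging relative to the community.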

While direct and objective measures of the quality and
importance of recreation programs do not exist, it is possible to use
opinions of knowledgeable people to produce indicants [11] of the
quality and importance of recreation programs (Helmer, 1966; Dalkey
et al., 1972; Mantel et al., 1972). Opinions for this purpose will
be obtained by means of a questionnaire (see p. 59) which will be
given to randomly selected households [9] in the Gainesville
community, all supervisors in the GRD [10] and PRAB members [10] (see
p. 57). The validity of these indicants of quality and importance
will be assessed [17] by use of a multitrait-multirater methodology
(see p. 66) and the Delphi technique (see p. 71).

Input measures

While  the  measures  of  program  input  pose  no  operational  diffi- 
culty, per  se,  the  absence  of  cost  accounting  and  program  budgeting 
means  that  desired  measures  will  have  to  be  produced  specifically 
for  this  research.     The  operating  division  heads  will  attempt  to 
provide  the  following  data  for  each  of  their  programs  (out  of  the 
55  identified  at  p.  49)  for  the  year  ended  September  30,  1974: 

1.  Direct  labor  cost 

2.  Other  direct  cost 

3.  Number  of  staff  and  staff  hours 

4.  Number  of  volunteers  and  volunteer  hours 

5.  User  fees 

Items  1  and  2  above  refer  to  the  variable  delivery  cost  of 
the  program.     Direct  labor  accounts  for  approximately  70%  of  total 
direct  costs.     No  attempt  will  be  made  to  obtain  information  on 
fixed  costs — these  costs  are  not  readily  available  and  their  allo- 
cation to  individual  programs  would  be  arbitrary. 

Items 3 and 4 refer to the labor (paid and unpaid) used
directly in providing a program. While data on other inputs would
also be desirable, they were not available.

While  many  programs  are  offered  free  of  charge,  some  require 
payment  of  entry,  usage  or  materials  fees  (item  5  above).     Such  fees 
represent a reduction in the cost of the program to the community
at large.

The  absence  of  formal  information  systems  for  the  routine 
collection  of  these  input  measures  means  that  estimates  will  again 
have to be used for certain items in certain programs. Therefore
the caveat issued in connection with user-service hours (see p. 50)
is  applicable  here. 
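
The treatment of items 1, 2 and 5 can be sketched as a simple calculation. The function names and dollar figures below are hypothetical; the sketch assumes only what the text states, namely that direct labor and other direct cost make up the variable delivery cost and that user fees reduce the cost borne by the community at large.

```python
def net_direct_cost(direct_labor, other_direct, user_fees=0.0):
    """Variable delivery cost (items 1 and 2) less user fees (item 5)."""
    return direct_labor + other_direct - user_fees


def labor_share(direct_labor, other_direct):
    """Fraction of total direct cost accounted for by direct labor."""
    total = direct_labor + other_direct
    if total == 0:
        raise ValueError("total direct cost is zero; labor share is undefined")
    return direct_labor / total


# Hypothetical program: $7,000 direct labor, $3,000 other direct cost,
# $1,500 collected in entry and materials fees.
cost = net_direct_cost(7000.0, 3000.0, 1500.0)
share = labor_share(7000.0, 3000.0)
print(cost, share)
```

Fixed costs are left out of the calculation, consistent with the decision not to allocate them to individual programs.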

Input-output  relationships 

Once the input and output measures have been obtained,
regression analysis (see p. 75) will be used in an attempt to identify
relationships between labor inputs and the quantity and quality of
output* [15]. While such relationships are analogous to production
functions, the input and output measures to be used refer to
different programs instead of different levels of input and output for
the same program.

In  its  production  of  a  particular  recreation  program,  it  appears 
reasonable  to  assume  that  the  GRD  intends  to  produce  a  certain  quantity 
and  quality  of  output.     Furthermore,  it  is  probably  possible  to 
trade off quantity and quality in the production of output with the
result  that  a  given  output  can  be  produced  with  various  combinations 
of  quality  and  quantity.     This  tradeoff  is  depicted  graphically  in 
Figure  3. 


*Since  the  GRD  has  no  control  over  the  importance  of  programs, 
measures  of  importance  are  excluded  from  this  analysis.     Since  the 
labor  input  measures  to  be  collected  relate  only  to  participant 
hours,  spectator  hours  are  excluded  from  user  hours  for  this  analysis. 


Figure 3. Tradeoff between quantity and quality in the
production of recreation output. (Axes: Quality, Quantity.)


If  the  nature  of  the  quantity-quality  interaction  were  known,  it 
would  be  possible  to  aggregate  the  quality  and  quantity  measures 
produced  in  this  research  into  a  scalar  index  of  output  which  could 
then  be  related  to  inputs  for  the  purpose  of  identifying  recreation 
production  functions.     Since  the  nature  of  this  interaction  is  not 
known,  the  relationships  to  be  examined  will  be  limited  to  the 
following: 

G1. $U = U(L)$

G2. $U = U_e(L)$; $Q = Q_e$; $e = 1, 2, 3$

G3. $Q = Q(L/U)$

where

$U$ = user hours
$L$ = labor hours
$Q$ = quality
$Q_e$ = low, average and high quality respectively

For G1 the relationship between output quantity and labor inputs
is to be examined for all programs without regard to program quality.
For G2 the observed program quality range will be divided into 3
categories (low, average and high quality) and the relationship
between output quantity and labor inputs for the programs in each
quality category will be examined. For both G1 and G2 it is expected
that $dU/dL > 0$. For G3 the relationship between quality and the
labor-user ratio will be examined. It appears reasonable to assume that
$dQ/d(L/U) > 0$.

If the relationships for G1, G2 and G3 can be described
successfully, decision makers should be better able to predict the effects
of  input  changes  on  program  quantity  and  quality.     Furthermore,  know- 
ledge of  G2  may  provide  insight  into  the  nature  of  the  quantity-qual- 
ity tradeoff  in  the  production  of  program  output. 
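
The data preparation implied by G2 and G3 can be sketched as follows. The text does not say how the observed quality range is to be divided, so the equal-width split used here is an assumption, as are the function names and the sample scores.

```python
def quality_category(q, q_min, q_max):
    """Map a quality score to e = 1 (low), 2 (average) or 3 (high).

    Assumption: the observed range [q_min, q_max] is cut into three
    equal-width bands. Other cut points (e.g. terciles) are equally possible.
    """
    if q_max == q_min:
        return 2  # no spread in observed quality; treat every program as average
    width = (q_max - q_min) / 3
    if q < q_min + width:
        return 1
    if q < q_min + 2 * width:
        return 2
    return 3


def labor_user_ratio(labor_hours, user_hours):
    """Labor hours per user hour, the explanatory variable in G3."""
    if user_hours <= 0:
        raise ValueError("user hours must be positive")
    return labor_hours / user_hours


# Hypothetical quality scores for three programs.
scores = [2.0, 3.0, 5.0]
categories = [quality_category(q, min(scores), max(scores)) for q in scores]
print(categories)
print(labor_user_ratio(50.0, 1000.0))
```

Programs sharing a category value would then be pooled for the G2 analysis, while the ratio feeds directly into G3.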

Simple linear models will first be applied to G1, G2 and G3.
If  such  models  are  found  to  be  inadequate,  more  complicated  models 
will  be  utilized. 
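
A simple linear model for G1 can be sketched with ordinary least squares, as below. This is a minimal illustration, not the regression procedure used in this research (see p. 75); the function name and the labor and user hour figures are hypothetical, and each observation is assumed to be one program.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns (intercept, slope). Raises ValueError if the inputs cannot
    support a fit (too few points, unequal lengths, or no variation in x).
    """
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need at least two paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x has no variation; slope is undefined")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: one (labor hours, user hours) pair per program.
labor_hours = [100.0, 200.0, 300.0, 400.0]
user_hours = [1100.0, 2100.0, 3100.0, 4100.0]
intercept, slope = fit_line(labor_hours, user_hours)
print(intercept, slope)
```

A positive slope would be consistent with the expectation that user hours rise with labor hours. The same function could be applied within each quality category for G2, or to quality against the labor-user ratio for G3.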

Uses of the Performance Measurement Model

While the operational performance measurement model
generates quantitative information which can be used for many different
purposes  at  different  levels  of  decision  making,  the  most  important 
uses  appear  to  be  the  following: 

1.  an  assessment  of  the  degree  to  which  the  GRD  and  PRAB  cor- 
rectly perceive  the  values,  desires  and  needs  of  the  Gainesville  com- 
munity. A  prerequisite  for  optimal  performance  (from  the  community's 
point of view) would appear to be agreement on program importance
between decision makers and community members. Furthermore, unless
there is agreement between decision makers and community members about
which  recreation  facilities  are  adequate  or  inadequate,  capital  bud- 
geting decisions  will  probably  be  unsound. 

2. a comparison of recreation programs on the basis of (1) physical
output, (2) quality of output, (3) importance of output, and (4) cost of
output. These comparisons can be made among different programs within a
particular year or over a number of years, and for the same program over
a number of years. They should improve resource allocation decisions and
provide direction for corrective action.

3.  an  evaluation,  based  on  quantitative  information  (as  noted 
in  2  above),  of  changes  in  GRD  performance  over  time. 


CHAPTER  IV 
METHODOLOGY 
Introduction 

The  model  and  the  procedures  involved  in  operationalizing  it  were 
set  forth  in  the  preceding  chapter.     In  this  chapter  certain  facets 
of  the  model  and  the  methodologies  used  in  its  implementation  will 
be  discussed  in  more  detail.     To  be  discussed  are  (1)  the  rater  groups 
used  to  produce  the  indicants  of  program  importance  and  quality  and 
facility  adequacy;   (2)  the  development  of  the  questionnaire  used  to 
obtain  opinions  from  the  rater  groups;   (3)  the  multitrait-multimethod 
methodology;   (4)  the  Delphi  technique;  and  (5)  regression  analysis. 

Rater  Groups 

Once the decision to use opinions to produce indicants of program
quality and importance had been made, it was necessary to decide whose
opinions were to be used. Three considerations, (1) a concern for
assessing the validity of the indicants produced, (2) a desire to
determine how well the decision makers (GRD and PRAB) reflected the views
of the community they served and (3) a desire to obtain citizen input,
resulted in the choice of the following 3 independent groups: GRD
supervisors, PRAB members and community members. These groups will now
be discussed.

GRD  Supervisors 

All 14 supervisors in the GRD agreed to participate. Since these
supervisors are primarily responsible for initiating, maintaining and
eliminating recreation programs, they represent a very knowledgeable
group in regard to recreation programs. Because there is a large amount
of interaction and cooperation among the supervisors, their knowledge
is not limited to those programs for which they are directly responsible.

While their knowledge of recreation programs militates in favor of
valid appraisals, their closeness to recreation activities and their
vested interest in the GRD's performance could inject bias (conscious
or unconscious) into their opinions.

PRAB Members

The members* of the PRAB also agreed to participate. Appointed to the
PRAB because of their interest in and knowledge of recreation, board
members, like the GRD supervisors, constitute a group of recreation
experts. However, board members are independent of the GRD and have no
direct vested interest in the GRD's performance.

*All 9 official members and 2 ex-officio members participated.

The opinions of the GRD supervisors and PRAB members will be obtained
by formal questionnaire within a Delphi framework (see p. 71).

Members of the Gainesville Community

The members of the Gainesville community are the primary users and
beneficiaries of recreation programs and facilities. Through their tax
dollars they bear most of the costs of providing these programs and
facilities. Therefore, they represent a valuable source of information
in regard to the importance and quality of recreation programs and the
adequacy of recreation facilities.

Ideally, the opinions of both program participants and a random sample
of members of the Gainesville community should have been obtained.
Unfortunately, for most recreation programs, participants were not known
and therefore direct contact with them was impossible. However, this
limitation could be somewhat overcome by developing a survey instrument
which distinguished between participants and non-participants.

Primarily because of resource constraints, the opinions of a random
sample of community members will be obtained by mail survey.

While an up-to-date list of individual members of the Gainesville
community was not available, a list of households was available in the
form of the City's Utility Billing Listing. (This listing contains the
name and address of all the customers of City electric, water and sewer
service.) Since the City is the sole source of electricity to the
community, this list encompasses most of the households and therefore
most of the members of the community.* Furthermore, the list is current
and accurate.

From this Utility Billing Listing of approximately 33,000 residential
accounts, 2,000 were selected at random for the mail survey.
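
The random selection of accounts can be illustrated with a short sketch. The source does not state how the draw was actually carried out; the fragment below simply assumes a simple random sample without replacement, and the account identifiers, the function name `draw_sample` and the seed are hypothetical.

```python
import random

def draw_sample(account_ids, sample_size, seed=None):
    """Simple random sample without replacement from a list of accounts."""
    rng = random.Random(seed)
    return rng.sample(account_ids, sample_size)

# Hypothetical identifiers standing in for the residential accounts
# on the billing listing (approximately 33,000 in the study).
accounts = list(range(1, 33001))

# The seed is fixed only so that the sketch is repeatable.
sample = draw_sample(accounts, 2000, seed=1)
print(len(sample), len(set(sample)))
```

Because the draw is made without replacement, no household can receive more than one questionnaire.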

*The billing listing obviously does not encompass all members of the
community and therefore the information obtained through its use will
be somewhat biased. This is a limitation of the research.

Questionnaire Development

The process of designing and testing a questionnaire for use in
obtaining the opinions of the rater groups proved to be quite extensive.
The stages of questionnaire development will now be discussed.

The initial questionnaire designed by the researcher was reviewed by
the members of the dissertation committee, GRD supervisors and
colleagues of the researcher. After necessary revisions were made, it
was presented to PRAB members at their monthly meeting with the Director
of Recreation. After incorporating certain changes suggested by PRAB
members, the questionnaire and a questionnaire evaluation form were
given to all 34 employees of the GRD for completion and evaluation.
Based on the results of this pretest, the questionnaire was revised for
the final time. (See Appendix B for the final questionnaire.)

Because  control  over  responses  of  households  selected  for  the 
mail  survey  was  limited  to  the  questionnaire  itself,  the  questionnaire 
was  designed  specifically  for  this  target  group.     The  questionnaire 
so  produced  was  easily  modified  for  use  with  the  GRD  and  PRAB. 

In final form the questionnaire (Appendix B) used in the mail survey
consisted of a cover letter and 5 parts:

Part 1  General Information
Part 2  Importance of Programs
Part 3  Quality of Programs
Part 4  Participation in Programs
Part 5  Adequacy of Facilities and Programs

The most important facets of this questionnaire will now be discussed.

Cover  Letter 

The cover letter contains

1. an appeal from the Director of the GRD for the cooperation of
community members. This appeal directly associates the questionnaire
with the GRD and should increase the response rate.

2. operational definitions of importance and quality: "By importance,
we mean how much a program contributes to making the Gainesville
community a more enjoyable place to live"; "By quality we mean how good
the program is." These definitions should help ensure that respondents
are replying to the same questions.

3. instructions for completing the questionnaire.

Socio-Economic  Data 

In order to encourage community members to express their opinions,
their anonymity was guaranteed. Such anonymity, however, precluded a
followup of non-responses and thereby introduced the possibility of
self-selection bias: the opinions of those returning the questionnaire
may not be representative of those not returning the questionnaire.
This possibility imposes limitations on the conclusions which can be
drawn from the results of the survey.

In an effort to identify the existence of self-selection biases, the
following socio-economic information was requested of respondents: sex;
age; education; marital status; number living in home; family income;
college status; and residence inside or outside City limits. This survey
information will be compared with that for the entire community (based
on 1970 census tract data) in an effort to assess the representativeness
of the survey.

Indicants  of  Program  Importance  and  Quality 

The approach* followed in generating indicants of program importance
and quality will now be discussed.

*This approach is based on that used for the JCF (see p. 30).

Importance. Each group member was asked to select from 5 categories of
importance the 1 category which best represented his opinion of the
importance of a recreation program to the Gainesville community. The
categories of importance and their numerical ratings are

Very low   1
Low        2
Average    3
High       4
Very high  5


Quality. Each group member was asked to select from 5 categories of
quality the 1 category which best represented his opinion of the quality
of the program. The categories of quality and their numerical ratings
are

Very poor  1
Poor       2
Fair       3
Good       4
Very good  5


Rather  than  force  a  respondent  to  express  an  opinion  of  the  quality  of 
a  program  with  which  he  was  not  familiar,  the  respondent  was  permitted 
to  select  "no  opinion"  as  his  response. 

So that the verbal categories can be described statistically, numerals
from 1 to 5 will be assigned to the categories. While technically
speaking the measures so produced are ordinal, they will be treated as
approximately an interval scale. Such treatment is common in social
science (Kerlinger, 1973, pp. 439-441). The mean of the numerical values
assigned to the opinions of group members will constitute the indicants
of program importance and quality.
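
The calculation of an indicant can be sketched as follows. The category labels and numerals are those given above; the sample responses are hypothetical, and the exclusion of "no opinion" responses from the mean is an assumption of this sketch rather than a procedure stated in the text.

```python
IMPORTANCE_SCALE = {"Very low": 1, "Low": 2, "Average": 3, "High": 4, "Very high": 5}
QUALITY_SCALE = {"Very poor": 1, "Poor": 2, "Fair": 3, "Good": 4, "Very good": 5}

def indicant(responses, scale):
    """Mean numerical rating of the responses found on the scale.

    Responses not on the scale (for example "No opinion") are skipped;
    this treatment is an assumption of the sketch.
    """
    scores = [scale[r] for r in responses if r in scale]
    if not scores:
        return None
    return sum(scores) / len(scores)

# Hypothetical quality responses from one rater group for one program.
quality_responses = ["Good", "Very good", "No opinion", "Fair", "Good"]
quality_indicant = indicant(quality_responses, QUALITY_SCALE)
print(quality_indicant)
```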

Questionnaire Length and Program Selection

The  rate  of  response  to  mail  questionnaires  is  normally  inversely 
related  to  the  length  of  the  questionnaire  (Parten,  1950,  p.  385). 
Furthermore,  long  questionnaires  are  likely  to  be  poorly  filled  out 
(Ibid.). 

Therefore, the desideratum of as large and complete a response as
possible appeared inconsistent with the inclusion of all 55 recreation
programs in the questionnaire. (The pre-test questionnaire, which
contained only 15 programs, required between 15 and 20 minutes for
completion.) The procedure finally decided upon was to select at random
24 of the 55 programs and then to assign these 24 programs at random
among 2 questionnaires (A and B) which were identical except for
programs. Questionnaires A and B will be alternated among the randomly
selected sample of 2,000 households (ABABAB...).

This method of alternately assigning questionnaires A and B to the
households should produce comparable response groups. If the 2 groups
are comparable, their responses should be similar.
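
The selection and alternation procedure can be sketched as follows. The program and household names are placeholders, and the split into 4 common programs plus 10 programs unique to each questionnaire is inferred from figures given in this chapter (24 programs in all, 4 appearing on both questionnaires, 14 per questionnaire); the actual mechanics of the draw are not described in the source.

```python
import random

rng = random.Random(0)  # fixed seed only so the sketch is repeatable

programs = [f"program_{i}" for i in range(1, 56)]  # 55 placeholder names
selected = rng.sample(programs, 24)                # 24 chosen at random

# Assumed split: 4 common programs, then 10 unique programs per form.
# rng.sample returns the items in random order, so slicing is itself
# a random assignment.
common = selected[:4]
form_a = common + selected[4:14]
form_b = common + selected[14:24]

# Alternate the two forms over the sampled households (ABABAB...).
households = [f"household_{i}" for i in range(1, 2001)]
assignment = {h: "AB"[i % 2] for i, h in enumerate(households)}

print(len(form_a), len(form_b), assignment["household_1"], assignment["household_2"])
```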

In order to determine the similarity of responses to questionnaires A
and B, 4 of the same programs were included on each questionnaire.
Indicants of program importance and quality for these 4 programs can be
tested statistically for agreement between the 2 groups. The chi-square
($\chi^2$) test will be used for this purpose.
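
As an illustration of the comparison between the 2 groups, the following sketch computes the Pearson chi-square statistic for a 2 x 5 table of rating counts for one common program. The counts are hypothetical, the function name `chi_square_statistic` is illustrative, and the 0.05 significance level is an assumption; the critical value 9.488 is the standard tabulated value for 4 degrees of freedom.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

# Hypothetical counts of importance ratings 1..5 for one common program.
group_a = [5, 10, 30, 35, 20]
group_b = [6, 12, 28, 33, 21]

stat, dof = chi_square_statistic([group_a, group_b])
CRITICAL_05_DF4 = 9.488  # tabulated value, alpha = 0.05, 4 degrees of freedom
groups_differ = stat > CRITICAL_05_DF4
print(f"chi-square={stat:.3f}, d.f.={dof}, groups differ: {groups_differ}")
```

A statistic below the critical value gives no evidence that the 2 response groups rate the program differently.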

Order  of  Programs  on  Questionnaire 

The order of recreation programs in questionnaire Parts 2, 3 and 4 was
randomly determined for each part. Thus, while programs are the same
across the 3 parts, their order differs. This procedure was followed in
order to discourage conscious coordination by respondents of their
responses across the 3 parts.

Participation  in  Programs 

In  questionnaire  Part  4  respondents  were  asked  to  indicate  the 
frequency  of  participation  (by  the  respondent  and  his  family)  in  the 
recreation  programs.     This  information  was  solicited  in  order  to 

1. determine if opinions of program importance and quality are related
to the frequency of participation in a program. These relationships will
be analyzed by means of the $\chi^2$ test.

2. assess the validity of opinions of questionnaire respondents. It
appears reasonable to assume that for most respondents knowledge of
program quality comes from participation in the program. If this
assumption is correct, "no opinion" responses for program quality should
be associated with the participation category "never participate." The
absence of such an association would appear to suggest that program
quality was evaluated in a capricious manner and that therefore the
opinions obtained are unreliable.

Adequacy  of  Recreation  Facilities 

During an examination of the initial questionnaire by the PRAB, several
board members raised questions concerning the adequacy of recreation
facilities and the impact of such facilities on the quality of
recreation programs. This resulted in the addition to the questionnaire
of Part 5 where respondents are requested to express their opinion as to
the adequacy or inadequacy of recreation facilities.

Adequacy will be assigned the numeral 1 and inadequacy the numeral 2.
Indicants of facility adequacy will be produced and the association
between facility adequacy and the quality of certain programs will be
examined by use of $\chi^2$.

The agreement of the adequacy ratings produced by the 3 groups will be
examined by use of the Pearson product-moment correlation coefficient.
Within the community group, the existence of significant correlations
between the ratings of respondent groups A and B would provide additional
evidence for the existence of comparable response groups (see p. 63).

Geographic  Location  of  Respondent 

An identification of the general geographic location of household
respondents was viewed as desirable because

1. recreation values, desires, needs and experiences may vary from
location to location within the community.

2. certain socio-economic characteristics can be identified with certain
geographic locations (based on 1970 census tract data) and therefore the
geographic location of respondents will be helpful in identifying
response biases.

The format of the Utility Billing Listing suggested the means by which
the identification of the general geographic location of respondents
could be made. Customer accounts (households) are grouped into 6 billing
cycles, each of which can be identified with a particular geographic
location in the Gainesville community:

Cycle 1  Predominantly southeast with some northeast
Cycle 2  Northeast
Cycle 3  Predominantly southwest
Cycle 4  Northwest
Cycle 5  Predominantly southwest with some northwest
Cycle 6  Northwest

By color-coding the questionnaires and sending questionnaires of the
same color to the accounts within a cycle, an identification of the
general geographic location of respondents can be made. This will permit
(1) the generation of indicants of importance, quality and adequacy by
geographic location and (2) the calculation of response percentages by
geographic location.

Modification of Survey Questionnaire for
Use with GRD and PRAB

The survey questionnaire required little modification for use in
obtaining the opinions of the GRD and PRAB. The modifications were

1. elimination of the general information section (Part 1). This
information was not needed.

2. increase in number of programs evaluated (from 14 to 55).

3. addition of columns for feedback information (for Delphi).

The same questionnaire (Appendix E) was used for GRD and PRAB members.
The cover letters (Appendix F) differed only slightly from one another.

Multitrait-Multimethod Methodology

Indicants of program importance and quality will be produced from the
opinions obtained from the 3 independent groups. The
multitrait-multimethod matrix, developed by Campbell and Fiske (1959),
will be used to assess the validity of these indicants. This methodology
has come to be recognized as a powerful tool for ascertaining the
validity of constructs (e.g., Kerlinger, 1973, pp. 464-466; Runkel and
McGrath, 1972, pp. 163-167).

Concerning  this  method  Campbell  and  Fiske  state  that 

1. Validation is typically convergent, a confirmation by independent
measurement procedures. Independence of methods is a common denominator
among the major types of validity (excepting content validity) insofar
as they are to be distinguished from reliability.

2. For the justification of novel trait measures, for the validation of
test interpretation, or for the establishment of construct validity,
discriminant validation as well as convergent validation is required.
Tests can be invalidated by too high correlations with other tests from
which they were intended to differ.

3. Each test or task employed for measurement purposes is a trait-method
unit, a union of a particular trait content with measurement procedures
not specific to that content. The systematic variance among test scores
can be due to responses to the measurement features as well as responses
to the trait content.

4. In order to examine discriminant validity, and in order to estimate
the relative contributions of trait and method variance, more than one
trait as well as more than one method must be employed in the validation
process. In many instances it will be convenient to achieve this through
a multitrait-multimethod matrix. (1959, p. 81)

To illustrate their validation process they present a synthetic
multitrait-multimethod matrix. (This matrix is included here as Table 1.)
In terms of this matrix, Campbell and Fiske state 4 criteria which bear
on the question of validity:

1. ...the entries in the validity diagonal should be significantly
different from zero and sufficiently large to encourage further
examination of validity....

2. ...a validity diagonal value should be higher than the values lying
in the column and row in the heterotrait-heteromethod triangles....

3. ...a variable [should] correlate higher with an independent effort to
measure the same trait than with measures designed to get at different
traits which happen to employ the same method....

4. ...the same pattern of trait interrelationship be shown in all of the
heterotrait triangles of both the monomethod and heteromethod blocks.
(1959, pp. 82-83)

The first criterion provides evidence of convergent validity; the next
3 criteria provide evidence of discriminant validity.

While Campbell and Fiske present their methodology in terms of
independent methods, it is evident from the examples they discuss that
independent raters can be viewed as different methods (1959, pp. 89-97).
The use and application of a multitrait-multirater approach for
measuring managerial performance has been discussed by Lawler, who
concludes that "this approach has advantages for establishing criteria
where they are needed, either for research purposes or for personnel
decision-making purposes" (1967, p. 369).

As employed in this research the 3 independent rater groups will
constitute the different methods, and the indicants of quality and
importance will constitute the traits. This combination of raters and
indicants will be hereafter referred to as the multivariable-multirater
matrix.

An illustration of the multivariable-multirater matrix to be used in
this research is presented in Table 2. Correlations among all 3 groups
for the 24 programs commonly evaluated will be obtained. Correlations
between the GRD and PRAB for all 55 programs will also be obtained. Thus
1 matrix will involve 3 rater groups and 2 variables; another, 2 rater
groups and 2 variables.
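
The construction of such a matrix, and a check of the first discriminant criterion, can be sketched as follows. All ratings are hypothetical mean ratings for five programs, and the names `pearson`, `ratings`, `validity` and `hetero` are illustrative only; the sketch covers only the comparison of validity-diagonal values with heterotrait-heteromethod values and is not a complete Campbell and Fiske analysis.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical mean ratings of five programs, keyed by (rater group, variable).
ratings = {
    ("GRD", "importance"):       [4.5, 3.0, 2.0, 4.0, 3.5],
    ("GRD", "quality"):          [3.0, 4.0, 3.5, 2.0, 4.5],
    ("PRAB", "importance"):      [4.0, 3.2, 2.5, 4.2, 3.0],
    ("PRAB", "quality"):         [3.2, 4.1, 3.0, 2.5, 4.4],
    ("Community", "importance"): [4.2, 2.8, 2.2, 3.9, 3.3],
    ("Community", "quality"):    [2.9, 3.8, 3.6, 2.2, 4.0],
}

# One correlation for every pair of rater-variable units.
matrix = {(a, b): pearson(ratings[a], ratings[b]) for a, b in combinations(ratings, 2)}

# Validity diagonal: same variable rated by different groups.
validity = [r for (a, b), r in matrix.items() if a[0] != b[0] and a[1] == b[1]]
# Heterotrait-heteromethod values: different variable, different groups.
hetero = [r for (a, b), r in matrix.items() if a[0] != b[0] and a[1] != b[1]]

print(min(validity) > max(hetero))
```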


[Full-page rotated table, apparently Table 2 (multivariable-multirater matrix: rater groups by program importance and program quality); table body not recoverable from the source.]


The opinions of the GRD and PRAB will be obtained several times and the
correlation between ratings based on opinions expressed at different
times will provide measures of reliability. Since opinions of community
members will be obtained but once, it will not be possible to assess the
reliability of their ratings in a traditional sense. However, some
insight into the stability of their ratings can be obtained from the
correlation between common program ratings of the A and B groups.

The Delphi Technique

In an effort to produce valid and reliable indicants of importance and
quality, the Delphi technique will be used in obtaining the opinions of
GRD and PRAB members. The technique has been well described by Dalkey:

The Delphi technique is a method of eliciting and refining group
judgements. The rationale for the procedures is primarily
the age old adage "Two heads are better than one,"
when  the  issue  is  one  where  exact  knowledge  is  not  available. 
The  procedures  have  three  features:     (1)  Anonymous  response- 
opinions  of  members  of  the  group  are  obtained  by  formal 
questionnaire.     (2)  Iteration  and  controlled  feedback- 
interaction  is  effected  by  a  systematic  exercise  conducted 
in  several  iterations,  with  carefully  controlled  feedback 
between  rounds.     (3)     Statistical  group  response-the  group 
opinion  is  defined  as  an  appropriate  aggregate  of  individual 
opinions  on  the  final  round.     These  features  are  designed 
to  minimize  the  biasing  effects  of  dominant  individuals, 
of  irrelevant  communications,  and  of  group  pressure  toward 
conformity.     (1969,  p.  v) 

The technique was developed by the Rand Corporation for the purpose of
forecasting future events (Helmer and Gordon, 1964). Experiments with
almanac type data (Dalkey, 1969) and with short-term forecasts
(Campbell, 1966) have suggested that responses generated by a Delphi
process are superior to those produced by individuals or committees.

The extension of Delphi to the area of value judgements is quite
recent. In this regard Dalkey et al. state

Most of the experiments which have been conducted to date have dealt
with factual material. However, in some applications,
the procedures have been employed to deal
with  a  quite  different  sort  of  material,  namely,  value 
judgements.     Typical  is  the  use  of  Delphi  procedures  to 
identify  and  rate  the  objectives  of  industrial  enterprises 
or  to  assess  the  relative  importance  of  military  missions. 
From  the  standpoint  of  the  decision  maker,  opinions  about 
values  and  objectives  are  just  as  relevant  to  decisions 
as  factual  opinions  about  consequences.     Hence,  the  question 
whether  Delphi  procedures  demonstrate  advantages  with  value 
material  of  the  same  sort  as  those  for  factual  material 
is  a  question  of  direct  importance.     (1972,  p.  55) 

In considering the logic of the extension of Delphi to value
judgements, the authors point out that "...if a group of equally
competent individuals expresses a range of opinions concerning a value
question, then the average opinion is more likely to approximate the
correct answer than an individual judgement, given the presumption that
there is a correct answer to the value question" (1972, p. 56). Since
there is no presently known way of assessing the excellence of value
judgements, Dalkey et al. present 3 conditions, found to exist in Delphi
experiments with factual data, as a means of partially evaluating the
usefulness of the Delphi technique in producing a group judgement on a
value question:

1. Reasonable distribution. If the distribution of group responses on a
given numerical value judgement is flat, indicating group indifference,
or if it is U-shaped, indicating either that the question is being
interpreted differently by two subgroups, or there is an actual
difference of assessment by two subgroups, then it seems inappropriate
to assert that the group considered as a unit has a judgement on that
question.

2. Group reliability. Given two similar groups (e.g., two groups
selected out of a larger group at random) the group judgements on a
given value question should be similar. Over a set of such value
judgements, the correlation for the two subgroups should be high.

3.  Change  and  convergence  on  iteration  with  feedback. 
This  condition  is  proposed  in  part  by  analogy  with  results 
from  experiments  with  factual  material,  that  is  shifts  of 
individual  responses  toward  the  group  response  and  reduction 
in  group  variability.     More  generally,  if  members  of  the 
group  do  not  utilize  information  in  reports  of  the  group 
response  on  earlier  rounds  when  generating  responses  on 
later  rounds,  it  seems  inappropriate  to  consider  these 
responses  as  judgements.     (1972,  p.  57) 

These criteria were used by the authors to evaluate judgements
generated in a series of experiments dealing with the objectives of
higher education and of everyday life and with the relative importance
of these objectives. Concerning the outcome of these experiments Dalkey
et al. state that "the results of applying the three criteria... to the
ratings of the educational and quality of life factors are all favorable
to the hypothesis that Delphi procedures are appropriate for formulating
group value judgements" (1972, p. 80). Subsequent to these experiments,
Dalkey et al. employed Delphi to develop and test an index of the
"quality of life" (1972, pp. 109-129).

Mantel et al. (1972) used a modified version of the Delphi technique to
develop measures of the value and quality of services provided by the
JCF (see p. 30, Chapter II). The researchers were pleased with the
results achieved; furthermore, the measures of quality and value
generated by Delphi were acceptable to JCF administrators.

The criteria proposed by Dalkey et al. and the degree of consensus
among the members of each group will be considered in the evaluation of
the existence of a group judgement as to recreation program importance
and quality.

A decision to use Delphi entails consideration of the type of feedback
information and the number of iterations. The 2 most common forms of
feedback which have been used in Delphi exercises are

1. Statistical feedback: generally the median group response and the
interquartile range from the previous round. Virtually all Delphi
studies have used this type of feedback.

2. Verbal feedback: this generally entails justification of extreme
responses (those outside the interquartile range). In some Delphi
experiments, participants whose responses on the second round are
outside the first round's interquartile range are asked to justify their
(relatively) extreme position. These justifications are then fed back
(along with statistical feedback) to all respondents in round 3. Round 3
respondents are permitted to provide counterarguments to the
justifications for extreme positions. The counterarguments are fed back
on round 4.

To  date,  the  evidence  is  inconclusive  as  to  which  type  of  feedback 
is  most  appropriate.    However,  verbal  type  feedback  poses  considerable 
difficulties  unless  the  number  of  participants  and  questions  is  extremely 
small:

If one includes all comments from all participants, the
volume of feedback rapidly becomes prohibitive and its
function self-defeating. On the other hand, the editing
of first-round (or any round) data must necessarily be
somewhat arbitrary. When opinions are aggregated and
condensed, certain participants will inevitably (and sometimes
justifiably) feel that their opinion has not been adequately
represented in the edited feedback version. The question
of validity also poses a difficult problem. (Thompson,
1973, p. 5)

Because  of  the  number  of  participants  (14  and  11)  and  questions 
(174)  involved  in  the  Delphi  exercises  in  this  research,  verbal  type 
feedback  would  be  extremely  difficult,  awkward  and  time  consuming. 
Therefore  it  will  not  be  used  and  feedback  will  consist  solely  of  the 
median  group  response  and  interquartile  range  on  the  previous  round. 
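
The statistical feedback described above can be sketched in a few lines. This is only an illustration: the ratings are hypothetical, the function names are invented for the example, and the "inclusive" quartile rule is an assumption, since the text does not say how quartiles were computed.

```python
from statistics import median, quantiles


def delphi_feedback(ratings):
    """Return (median, q1, q3) for one question's ratings from the previous round."""
    # Assumption: quartiles use the "inclusive" interpolation rule.
    q1, _, q3 = quantiles(ratings, n=4, method="inclusive")
    return median(ratings), q1, q3


def outside_iqr(ratings, q1, q3):
    """Return the positions of raters whose rating lies outside [q1, q3]."""
    return [i for i, r in enumerate(ratings) if r < q1 or r > q3]


# Hypothetical round 1 ratings from nine raters (illustrative values only).
round1 = [2, 3, 3, 4, 4, 4, 5, 5, 5]
med, q1, q3 = delphi_feedback(round1)
extreme = outside_iqr(round1, q1, q3)
print(med, q1, q3, extreme)
```

The median and the two quartiles are what would be returned to raters on the next round; the list of positions shows which raters would have been asked for a justification had verbal feedback been used.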

Iterations in previous Delphi studies have ranged from 1 to 6
and evidence as to the optimal number of iterations is inconclusive.
The number of iterations cannot be determined a priori. Rather, the
number of iterations will depend on (1) the degree of group consensus,
(2) the amount of convergence between rounds and (3) the willingness of
group members to participate in additional iterations.

Regression  Analysis 

The functional relationships hypothesized for the input-output
measures (see p. 54) will first be examined by using simple linear
regression analysis. The regression programs in Statistical Package
for the Social Sciences (SPSS) (Nie et al., 1975) will be employed for
this purpose. The strength of total relationships will be assessed
by examining r² and the regression coefficients will be tested for
significance with an F-test.
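
As a rough illustration of the quantities named above, the following sketch fits a simple linear regression by least squares and reports r² and the F statistic for the slope. It is not the SPSS procedure itself; the function name and the five data points are hypothetical.

```python
def simple_linear_regression(x, y):
    """Least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared, f_stat), where f_stat has
    1 and n - 2 degrees of freedom.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    f_stat = r_squared / (1 - r_squared) * (n - 2)
    return slope, intercept, r_squared, f_stat


# Hypothetical input (x) and output (y) measures for five programs.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
slope, intercept, r_squared, f_stat = simple_linear_regression(x, y)
print(slope, intercept, r_squared, f_stat)
```

The F statistic would then be compared with the tabulated value for 1 and n − 2 degrees of freedom at the chosen level of significance.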

Plots of the data and residuals will be examined in order to assess
the validity of the linearity assumption. If the plots suggest
a departure from linearity, non-linear regression models will be applied
to the data using the SPSS polynomial regression program.

The regressions may reveal that the posited relationships are
insignificant. While this may be due to the actual absence of
relationships, it could also be due to the nature of the data employed:

1. The indicants of quality are based on opinions. While great
care is being exercised in obtaining and evaluating these opinions,
no conclusion as to their ultimate validity is warranted.

2.  Participant  hours,  staff  hours  and  volunteer  hours  are  the 
results  of  estimates  by  division  heads  in  the  GRD.     Thus  relationships 
actually  existing  may  be  obscured  by  the  unreliability  of  the  data. 

Statistical Procedures

The statistics used in this research can be found in Blalock
(1972). Most of the statistical computations were performed by using
the programs in SPSS. Unless indicated otherwise the .05 level of
significance is being used.


CHAPTER  V 


DATA COLLECTION AND ANALYSIS

Introduction

This  chapter  is  concerned  with  the  collection  and  analysis  of 
the  data  specified  by  the  operational  model  presented  in  Chapter  III. 
The  methods  used  to  collect  and  analyze  this  data  were  discussed  in 
Chapter  IV.     The  data  collection  and  analysis  will  be  discussed  in  the 
following order: (1) community survey data; (2) Delphi;
(3) multivariable-multirater analysis; (4) facility adequacy;
(5) validity of reliance on self-evaluations; and (6) input-output
relationships.

Community  Survey  Data 

Collection 

The  final  questionnaire  revisions  were  completed  by  May  1,  1975. 
A cover letter (Appendix C) from the researcher to questionnaire
recipients and a self-addressed, postage paid return envelope were then
prepared. After the questionnaires (type A and B), cover letters and
return envelopes had been printed, they were placed in envelopes and
mailed, on May 28, 1975, to each of the 2,000 households in the sample.

Different colored questionnaires were used and the households
in each cycle were sent a different color of questionnaire. This made
possible the identification of the general geographic location of
respondents. Questionnaires A and B were alternated within each cycle
(ABABAB...).

The  first  responses  were  received  May  30  and  within  2  weeks  90% 
of  the  total  number  returned  had  been  received.     By  August  12,  486 
questionnaires,  representing  a  response  rate  of  24.3%,  had  been  received. 
The numbers of questionnaires mailed and returned are listed by cycle
in Table 3. In view of the length of the questionnaire, the response
rate was better than expected. However, it is still too low to permit
generalizing the results of the survey to the population from which
the sample was selected. This inability to generalize is a limitation
of this research.

Analyses 

Comparison  of  response  rates  between  A  and  B 

Because A and B were identical except for programs (which were
randomly assigned to each) and because of the manner in which they
were distributed to households, no significant difference in response
rates was expected between A and B.

A and B response rates were compared by geographic location and
in total with a χ² test. The results of the 2 tests are reported in
Table 4. The hypothesis that there was no difference in response rate
could not be rejected.
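
The kind of test reported above can be sketched as a χ² test on a table of returned versus not-returned questionnaires. The counts below are hypothetical (Table 4 is not reproduced here); they assume 1,000 questionnaires of each type and a split of the 486 returns chosen only for illustration. The statistic is computed without a continuity correction, which may differ from what the original analysis applied.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Hypothetical counts: rows are questionnaires A and B,
# columns are (returned, not returned).
observed = [[250, 750],
            [236, 764]]
stat, df = chi_square(observed)

# Tabulated chi-square critical value for 1 degree of freedom at .05.
CRITICAL_05_DF1 = 3.841
reject = stat > CRITICAL_05_DF1
print(round(stat, 3), df, reject)
```

With these illustrative counts the statistic falls well below the critical value, so the hypothesis of equal response rates would not be rejected.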

Comparison  of  response  rates  between  geographic  locations 

One of the reasons for obtaining the geographic location of
respondents was to help in identifying response biases. Prior knowledge
of (1) the effect of socioeconomic characteristics on response rates
(Parten, 1950, p. 391) and (2) differences in socioeconomic characteristics
between locations in Gainesville (based on 1970 census) resulted in the
hypothesis that response rates would differ between locations. The
null hypothesis of no difference was tested with χ² and it was rejected
(see Table 5). An examination of Table 3 reveals that Southeast,
Northeast and Southwest with some Northwest (Cycle 5) are underrepresented
while the other locations, especially Northwest (white), are
overrepresented. Therefore, if the opinions of respondents also differ by
geographic location, the ratings of program importance and quality and
facility adequacy will be biased toward the overrepresented areas.

Statistics  produced 

The following statistics were computed for each of the 59 questions
appearing on the A and B questionnaires: (1) frequency distributions;
(2) histograms; (3) means; and (4) standard deviations. The mean
importance and quality ratings are presented in Tables 19 and 20 (p. 115)
and the mean adequacy ratings are presented in Table 28 (p. 137).

Comparability  of  A  and  B  response  groups 

Because of the methodologies employed in questionnaire preparation
and distribution (see pp. 63-64), the A and B response groups were
expected to be very comparable. Indeed, the existence of significant
differences in the responses of A and B to identical questions would
cast doubt on the stability and generalizability of the measures of
importance, quality and adequacy which were developed from the opinions
of survey respondents.

χ² was used to test the hypothesis of no difference between A
and B in regard to

1. socioeconomic characteristics

2. ratings of importance, quality and participation for the
4 common programs (adult ceramics, springboard diving lessons, summer
track and tumbling lessons)*

3. ratings of facility adequacy.*

The comparisons made and related statistics are presented for the
above 3 categories in Tables 6, 7, and 8 respectively. For the 8
socioeconomic characteristics (Table 6), 4 program quality ratings
(Table 7) and 4 participation ratings (Table 7) the hypothesis that
there was no difference between A and B could not be rejected. Of
the 9 facilities evaluated (Table 8) as adequate or inadequate, for
only 1 (recreation centers) was the hypothesis of no difference
rejected. For the 4 program importance ratings, however, the
hypothesis of no difference had to be rejected for each program (Table 7).

Although the preponderance of tests performed support the
hypothesis that A and B are comparable, the unanimous rejection
of this hypothesis for importance ratings was, at first, quite
puzzling. This problematic situation led to a search for a plausible
explanation.

A closer analysis of the programs contained on the A and B
questionnaires revealed that only A contained the 2 recreation programs
with the greatest public participation and visibility: park and
picnic facilities and public swimming. These 2 programs also
received the highest average importance ratings across the 3 groups
of raters. Now if the respondents had rated each recreation program
relative to the other programs contained in the same questionnaire,
the 4 common programs would not have received as high a rating by
the A group as by the B group (whose questionnaire did not contain any
programs rated "high" in importance). A comparison of the mean importance
ratings of the common programs between A and B (Table 9) revealed
that the means of B exceeded those of A and that the differences
were significant (t test) at the .05 level.

*Parametric tests were not used because the researcher was interested
in comparing the response distributions of A and B.


TABLE 9

COMPARISON OF A AND B MEAN IMPORTANCE RATINGS
FOR COMMON PROGRAMS


                                   Importance
Program                           A          B

Adult Ceramics                  2.802      3.279
Springboard Diving Lessons      2.589      3.149
Summer Track                    2.930      3.283
Tumbling Lessons                2.595      3.090


The impact of context on psychophysical measures is well
recognized (Parducci, 1974) and it appears that the different programs
appearing on the A and B questionnaires created contextual effects
which resulted in the observed significant difference in ratings
of program importance. While context should affect absolute ratings
of importance, it should not affect the rank order of the ratings.
Pearson's product-moment correlation coefficient for the common
program means is .9197 and is significant at .04. Thus based on
rank order, the importance ratings are comparable.
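
The correlation reported above can be approximated directly from the means in Table 9. The sketch below is illustrative; the function name is invented, and because the tabled means are rounded to three decimals, the result is expected to be close to, rather than identical with, the reported .9197.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


# Mean importance ratings from Table 9, in the order: adult ceramics,
# springboard diving lessons, summer track, tumbling lessons.
a_means = [2.802, 2.589, 2.930, 2.595]
b_means = [3.279, 3.149, 3.283, 3.090]

r = pearson_r(a_means, b_means)
print(round(r, 2))
```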

In  summary,  based  on  the  results  of  tests  performed,  the 
conclusion  that  A  and  B  response  groups  are  comparable  appears 
warranted. 

Validity  of  survey  responses 

An evaluation of the validity of the opinions of survey
respondents was made in order to determine if these opinions could be used
in this research. The evaluation was based on both qualitative and
quantitative information.

Qualitative. Since the questionnaire was 6 pages long,
contained 59 questions and required about 20 minutes for completion,
it does not seem very likely that someone would complete it unless
he had a serious interest in it. Given such interest, legitimate,
non-capricious responses would be expected. Almost all of the
questionnaires were completed in accordance with instructions, again
suggesting legitimate responses. Consistent with expectations, park and
picnic facilities and public swimming received high importance ratings; if
the questionnaires had been completed capriciously, such high ratings
would have been unlikely.

Quantitative. The quantitative assessment of validity is
based on the relationship between program participation and the
expression of an opinion on program quality. Since it is reasonable
to assume that for most people knowledge of program quality comes
from participation in the program, those survey respondents who do
not participate in a program would be expected to express "no opinion"
as to quality. A test of the existence of such a relationship can
be conducted by use of the χ² statistic. The null hypothesis is
that participation has no effect on the expression of an opinion
on quality. The contingency table used to test this hypothesis
is presented in Table 10 below. The test was conducted for each of
the 28 programs contained in questionnaires A and B. For all 28
programs the null hypothesis that participation has no effect was
rejected. For 27 programs the level of significance was .0001
while for the remaining program it was .004.

Based on the qualitative and quantitative evidence, the
conclusion warranted is that survey responses are legitimate,
non-capricious expressions of opinion and that they can be relied upon
for the purposes of this research.

TABLE 10

CONTINGENCY TABLE FORMAT FOR TESTING EFFECT
OF PARTICIPATION ON EXPRESSION
OF OPINION


                                  Quality
Participation             Opinion      No Opinion

Don't Participate
Participate


Comparison of survey socioeconomic characteristics
with those for the Gainesville community

The distribution of the socioeconomic characteristics of
survey respondents was compared to the distribution of such
characteristics in the Gainesville community in order to identify the
existence of significant response biases. The χ² statistic was used for
this purpose. Table 11 reflects the actual and expected response
frequencies and the related test statistics.

The expected frequencies reported in Table 11 are based on
the Census of Population and Housing, Gainesville, Florida, Standard
Metropolitan Statistical Area, 1970. In using the 1970 census, it
is assumed that the distribution of socioeconomic characteristics
in the population has not changed significantly since 1970.
Furthermore, in making comparisons of census and survey data, certain
interpolations were necessary because census categories were sometimes
slightly different from those in the survey. Despite these
limitations, the census data is believed to be adequate for the purpose
of this research.

The  Gainesville  Community  was  operationally  defined  as  all 
census  tracts  within  the  city  limits  plus  all  census  tracts  contiguous 
to  the  city  limits  (census  tracts  1  through  17). 

The hypothesis of no significant difference between survey
socioeconomic characteristics and those for the community was
rejected for all 8 socioeconomic characteristics. The major survey
biases are listed below:

1. Males outnumber females 2:1 in the survey, but for the
community the ratio is unity.

2. The 21-30 age group is considerably overrepresented while
the 16-20 age group is considerably underrepresented.

3. While 64% of survey respondents have a college degree,
only 29% of those in the community do.

4. Married people are overrepresented.

5. While 44% of the survey respondents reported family incomes
over $15,000, only 20% of family incomes in the community exceed this
amount.

6. College students are overrepresented.

7. Those outside the city limits are overrepresented.

Socioeconomic characteristics of survey respondents are not
representative of those in the community. The survey is especially biased
towards those with a high level of education and income and those in
the 21-30 age bracket. Community members with little formal
education have provided almost no input.

While  the  biases  found  in  the  socioeconomic  characteristics 
do  not  necessarily  entail  biases  in  ratings  of  importance,  quality 
and  adequacy,  the  burden  of  proving  they  do  not  must  be  borne  by 
anyone  who  wishes  to  generalize  from  the  survey  to  the  community 
at  large. 
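
The comparison of survey and census distributions in this subsection is a χ² goodness-of-fit test. A minimal sketch follows, using the 2:1 male-to-female ratio noted above. The counts are hypothetical: they assume that all 486 respondents reported their sex, which the text does not state, and the function name is invented for the example.

```python
def goodness_of_fit(observed, expected_proportions):
    """Chi-square goodness-of-fit statistic and degrees of freedom."""
    total = sum(observed)
    stat = 0.0
    for obs, proportion in zip(observed, expected_proportions):
        expected = total * proportion
        stat += (obs - expected) ** 2 / expected
    return stat, len(observed) - 1


# Hypothetical survey counts (male, female) in a 2:1 ratio.
observed_sex = [324, 162]
# Community proportions implied by a male-to-female ratio of unity.
census_share = [0.5, 0.5]

stat, df = goodness_of_fit(observed_sex, census_share)
print(stat, df)
```

Under these assumptions the statistic is far above the tabulated value of 3.841 for 1 degree of freedom at the .05 level, which is consistent with the rejection reported for this characteristic.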

Effect  of  geographic  location  on  ratings 

In an effort to determine if recreation values, needs and
experiences were uniform throughout the community, ratings of
importance, quality and adequacy were compared by geographic location.
χ² was used to test the hypothesis that location has no effect. The
contingency table formats used in the tests are presented in Table 12
and the results of the tests are reported in Table 13.

TABLE 12

CONTINGENCY TABLE FORMATS FOR TESTING EFFECT
OF GEOGRAPHIC LOCATION ON IMPORTANCE,
QUALITY AND ADEQUACY


                                   Importance           Degrees of
                            Low     Average     High     Freedom

Predominantly Southeast
  with some Northeast
Northeast
Predominantly Southwest
Northwest
Predominantly Southwest
  with some Northwest
Northwest                                                   10


                                    Quality
                            Poor     Fair       Good

Predominantly Southeast
  with some Northeast
Northeast
Predominantly Southwest
Northwest
Predominantly Southwest
  with some Northwest
Northwest                                                   10


                                   Adequacy
                            Adequate     Inadequate

Predominantly Southeast
  with some Northeast
Northeast
Predominantly Southwest
Northwest
Predominantly Southwest
  with some Northwest
Northwest                                                    5

Geographic  location  was  found  to  have  no  effect  on  importance 
ratings  and  little  effect  on  quality.     For  adequacy,  however,  3  of 
the  9  facility  ratings  differed  by  location. 

Based on the above results, the separate reporting of ratings
of importance and quality by geographic location would provide only
minimal additional information. Furthermore, the observed bias
towards certain locations (see p. 82) does not appear to be
significant insofar as the measures of importance and quality are concerned.
The observed differences in facility adequacy by geographic location
should be reported to decision makers as such information should be
useful in deciding on locations for new facilities.

TABLE 13

EFFECT OF GEOGRAPHIC LOCATION ON IMPORTANCE,
QUALITY AND ADEQUACY


                    Sig.      Not Sig.

Importance            0          28
Quality               3          25
Adequacy              3           6


Note: χ² was used to identify differences significant
at .05 for 28 programs and 9 facilities.

Effect of socioeconomic characteristics on ratings

The contingency table formats used to compare importance,
quality and adequacy ratings by socioeconomic characteristics are
presented in Table 14. The results of the χ² tests of the hypothesis
that socioeconomic characteristics have no effect are presented in
Table 15 and are summarized below:

1.  Importance  ratings  are  substantially  affected  by  sex, 
age,  level  of  education,  marital  status  and  college  status.  Family 
income  and  residence  had  some  effect.     Number  in  home  had  no  effect. 

2.  Quality  ratings  were  not  substantially  affected  by  any 
socioeconomic  characteristic. 

3. Sex, age, level of education and family income had some
effect on adequacy ratings.

The biases toward certain socioeconomic characteristics (see p. 98),
coupled with the effect of such characteristics on importance and
adequacy ratings, have several implications for this research:

1.  Generalization  of  these  ratings  to  the  community  at 
large  is  unwarranted. 

2.  Conclusions  based  on  these  ratings  can  only  be  tentative — 
more  representative  ratings  may  yield  different  results  necessitating 
different  conclusions. 

Because quality was not greatly affected by socioeconomic
characteristics, observed biases would not appear to be as critical for
quality ratings.

TABLE 14

CONTINGENCY TABLE FORMATS FOR TESTING EFFECTS
OF SOCIOECONOMIC CHARACTERISTICS ON IMPORTANCE,
QUALITY AND ADEQUACY


                                   Importance           Degrees of
                            Low     Average     High     Freedom

Sex
  Male
  Female

Age
  16-20
  21-30
  31-50
  51-65
  Over 65

Level of Education
  No College
  Some College
  College Degree

Married
  Yes
  No

Number Living in Home
  1
  2
  Over 2

Family Income
  0-5,000
  5,001-10,000
  10,001-15,000
  15,001-20,000
  20,001-30,000
  Over 30,000                                               10

College Student
  Yes
  No

Live in Gainesville City Limits
  Inside City
  Outside City


                                    Quality             Degrees of
                            Poor     Fair       Good     Freedom

Sex
  Male
  Female

Age
  16-20
  21-30
  31-50
  51-65
  Over 65

Level of Education
  No College
  Some College
  College Degree

Married
  Yes
  No                                                         2

Number Living in Home
  1
  2
  Over 2

Family Income
  0-5,000
  5,001-10,000
  10,001-15,000
  15,001-20,000
  20,001-30,000
  Over 30,000

College Student
  Yes
  No

Live in Gainesville City Limits
  Inside City
  Outside City


                                   Adequacy             Degrees of
                            Adequate     Inadequate      Freedom

Sex
  Male
  Female

Age
  16-20
  21-30
  31-50
  51-65
  Over 65                                                    4

Level of Education
  No College
  Some College
  College Degree                                             2

Married
  Yes
  No                                                         1

Number Living in Home
  1
  2
  Over 2                                                     2

Family Income
  0-5,000
  5,001-10,000
  10,001-15,000
  15,001-20,000
  20,001-30,000
  Over 30,000                                                5

College Student
  Yes
  No                                                         1

Live in Gainesville City Limits
  Inside City
  Outside City


Effect of participation on ratings of importance and quality

The survey questionnaire was designed so that the opinions
of program participants could be compared with non-participants in
order to learn if significant differences of opinion existed between
these 2 groups. The contingency tables used for the χ² test are
presented in Table 16 below. Significant differences between the
ratings of participants and non-participants were found for 18 of
28 programs in regard to importance and for 8 of 28 programs in
regard to quality.

TABLE 16

CONTINGENCY TABLE FORMATS FOR TESTING EFFECT OF
PARTICIPATION ON RATINGS OF IMPORTANCE AND QUALITY


                                Importance
Participation            Low     Average     High

Participate
Don't Participate


                                 Quality
                         Poor     Fair       Good

Participate
Don't Participate

The existence of substantial difference of opinions between
participants and non-participants indicates the need for decision
makers to consider both groups in making decisions regarding
recreation resource allocations. Since participants have the most to
gain, they are likely to be more vocal than non-participants. If
their views are the only input to the decision making process,
non-optimal decisions may result.

Objective Measures of Input and Output Quantity

Each division head was asked to provide the researcher with
estimates of output quantity (see p. 50) and inputs (see p. 52) for
the programs in his division. In order to promote the comparability
and accuracy of these estimates, data collection forms (see Appendix D)
were prepared by the researcher for the use of the division heads.
After the forms had been completed, they were reviewed by the researcher.
Questions about the data reported on the forms were discussed with
the appropriate division head. These discussions resulted in some
revisions of the original estimates.

The  division  heads  were  able  to  supply  the  researcher  with 
estimates  of  the  data  requested  for  45  of  the  original  55  programs. 
Difficulties were experienced in the development of other direct
costs (which as reported herein consist primarily of materials and
supplies):

1.  Materials  used  by  the  Aquatics  Division  consist  mainly  of 
chemicals  for  water  treatment — any  allocation  to  individual  programs 
would  have  been  arbitrary. 

2. For Center Division programs requiring payment of a
materials usage fee, no estimate of materials costs was provided.

3. All division heads stated that reliable estimates of
utility and maintenance costs could not be made.

The data obtained are listed in Tables 17 and 18. Table 17
contains the programs evaluated by all 3 groups. Table 18 contains
the programs which were only evaluated by the GRD and PRAB.

Delphi

Data  Collection 

Delphi questionnaires (Appendix E) containing all 55
recreation programs and 9 facilities were given to the 14 GRD supervisors
and 11 PRAB members. The cover letters accompanying the questionnaires
can be found in Appendix F. All questionnaires were returned to the
researcher and the following statistics were separately computed for
each group: (1) means, (2) standard deviations, (3) frequency
distributions, (4) histograms, (5) medians, and (6) interquartile ranges.
The importance and quality means for round 1 are given in Tables 19
and 20. The median and interquartile range (feedback) for the
importance and quality of each program were placed on another questionnaire
(Appendix E) which was then given to the GRD and PRAB. A cover letter
(Appendix F) explaining the purpose of the feedback information
accompanied the questionnaires. Again all questionnaires were returned and
statistics were computed for the second round responses.

Analysis 

The  following  4  criteria  for  the  existence  of  a  group  judgement 
were  discussed  in  Chapter  IV  (pp.  72-73): 

1. change and convergence with feedback

2. group reliability

3. reasonable distribution

4. consensus in group

All 4 criteria were applied to the opinions of the GRD and PRAB
and the last 3 were applied to the opinions of the community.*
The results and conclusions follow.

*Since the opinions of community members were obtained but once,
criterion 1 was not applicable.

Change and convergence on iteration with feedback

This criterion requires shifts of individual responses toward
the group response and reduction in group variability. Based on the
results of previous research with group value judgements (Dalkey et al.,
1972),

1. significant changes in mean ratings were not expected between
rounds

2. high correlations between round 1 and 2 means were expected

3. standard deviations were expected to decrease from round 1
to round 2.

The changes which occurred in program importance and quality means and
standard deviations between round 1 and 2 are presented in Table 21.

While changes in means occurred for about 85% of the programs,
a t test of the statistical significance of the changes revealed that
none of them were significant. Pearson product-moment correlation
coefficients between the 55 program means of round 1 and 2 were high:

                   GRD        PRAB

Importance        .9777*     .9575*
Quality           .9422*     .9533*
For about one-third of the programs, standard deviations failed to
decrease between rounds. In view of this, an overall assessment
of convergence was made by comparing the mean standard deviations
of program importance and quality between rounds:

                       GRD                 PRAB
                  Rd. 1    Rd. 2      Rd. 1    Rd. 2

Importance        .7356    .6712      .8853    .8265
Quality           .7548    .6621      .9070    .7398

Since round 1 standard deviations exceed round 2, some convergence
has occurred. Except for the importance standard deviations of the
PRAB, the differences are statistically significant (t test).
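
To put the convergence figures above on a common scale, the reduction in mean standard deviation can be expressed as a percentage of the round 1 value. The sketch below uses the four pairs of means given in the text; the percentage measure itself is an illustrative addition and was not part of the original analysis.

```python
# Mean standard deviations (round 1, round 2) as reported in the text.
mean_sd = {
    ("GRD", "Importance"): (0.7356, 0.6712),
    ("GRD", "Quality"): (0.7548, 0.6621),
    ("PRAB", "Importance"): (0.8853, 0.8265),
    ("PRAB", "Quality"): (0.9070, 0.7398),
}


def percent_reduction(before, after):
    """Reduction from before to after, as a percentage of before."""
    return (before - after) / before * 100


reductions = {key: percent_reduction(*pair) for key, pair in mean_sd.items()}

for (group, measure), value in reductions.items():
    print(f"{group:5s} {measure:11s} {value:5.1f}%")
```

On this measure the smallest reduction is for PRAB importance ratings, the one case in which the text reports that the difference was not statistically significant.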

In summary, mean ratings behaved as anticipated and convergence,
while less than expected, did occur. The criterion of "change and
convergence on iteration with feedback" has been met fairly well.

Since no significant changes in mean ratings occurred between
rounds, additional iterations did not appear fruitful. Furthermore,
since the information content of round 1 and 2 ratings is essentially
the same, only round 1 ratings will continue to be used.


*Significant at .001.
Group reliability

Dalkey  et  al.  state  that 

Given  two  similar  groups  (e.g.,  two  groups  selected  out  of 
a  larger  group  at  random)  the  group  judgements  on  a  given 
question should be similar. Over a set of such value judgements,
the correlation for the two subgroups should be high.
(1972,  p.  57) 

For this research the similar groups would appear to be the GRD and
PRAB and community groups A and B. The Pearson product-moment
correlation coefficients between ratings for these 2 sets of groups are
given below:


              GRD & PRAB         Community A & B
              (55 programs)      (4 programs)

Importance    .7961 (.001)*      .9197 (.04)*
Quality       .6345 (.001)*      .8451 (.08)*


Based  on  the  above,  the  reliability  criterion  would  appear  to  be  met. 


Reasonable  distribution 

Concerning  this  criterion  Dalkey  et  al.  write 

If  the  distribution  of  group  response  on  a  given  numerical 
value  judgement  is  flat,  indicating  group  indifference,  or 
if  it  is  U-shaped,  indicating  either  that  the  question  is 
being  interpreted  differently  by  two  subgroups,  or  there  is 
an  actual  difference  of  assessment  by  two  subgroups,  then 
it  seems  inappropriate  to  assert  that  the  group  considered 
as  a  unit  has  a  judgement  on  that  question.**  (1972,  p.  57) 


*Significance level

**Based on prior Delphi research, a single peaked, normal type
distribution is expected if a group judgement exists.
The frequency distributions (histograms) for program importance and
quality  for  the  GRD,  PRAB  and  community  were  examined  and  judged  to 
be  flat,  bimodal  or  single  peaked.*    The  results  are  reflected  in 
Table  22.     For  importance  96%  of  the  programs  satisfy  the  criterion 
of  "reasonable  distribution"  while  for  quality  80%  of  the  programs 
satisfy  it. 

In  summary,  the  criterion  of  "reasonable  distribution"  has 
been  met  for  most  programs. 

Consensus  in  group 

In  the  development  of  the  social  service  measurement  model 
for  the  JCF  (see  p.   30)  consensus  among  group  members  was  used  as 
a  criterion  for  group  judgement.     Unless  approximately  80%  of  the 
group  agreed  on  2  contiguous  categories  of  the  rating  scale,  a  group 
judgement  was  not  considered  to  exist. 
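The consensus rule just described (roughly 80% of the group falling in 2 contiguous rating categories) can be sketched in a few lines. This is an illustration only: the function names, the 0.80 default and the sample counts are assumptions introduced here, not values taken from the JCF study or from Table 23.

```python
def max_contiguous_share(counts):
    """Largest share of raters falling in any 2 adjacent scale categories.

    counts: number of raters choosing each category, in scale order.
    """
    total = sum(counts)
    if total == 0 or len(counts) < 2:
        raise ValueError("need at least 2 categories and 1 rating")
    best = max(counts[i] + counts[i + 1] for i in range(len(counts) - 1))
    return best / total


def has_consensus(counts, threshold=0.80):
    """True when the best pair of adjacent categories reaches the threshold."""
    return max_contiguous_share(counts) >= threshold


# Hypothetical rating counts on a 5-point scale (20 raters each).
clustered = [0, 1, 2, 9, 8]
spread = [4, 4, 4, 4, 4]
print(max_contiguous_share(clustered), has_consensus(clustered))
print(max_contiguous_share(spread), has_consensus(spread))
```

Reporting the share itself, rather than only the pass or fail result, allows groups to be compared by degree of consensus as is done in the following paragraphs.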

The highest percentages of group members expressing an opinion
in 2 contiguous categories for program importance and quality ratings
are presented in Table 23. The GRD exhibits the greatest degree
of  consensus  and  is  followed  by  the  PRAB.    While  consensus  is  lowest 
for  the  community,  in  no  case  is  it  less  than  50%. 

The  degree  of  consensus  for  the  GRD  and  PRAB  at  the  end  of 
round  2  compares  very  favorably  with  the  degree  of  consensus  achieved 


*Based on prior Delphi research, a single peaked, normal type
distribution is expected if a group judgement exists.
for the JCF at the end of round 4* (Reisman et al., 1970, pp. 21-23).

In summary, the criterion of consensus appears to have been met
reasonably well.

Overall  conclusion 

It  is  the  conclusion  of  this  researcher  that  a  group  judgement 
exists  for  most  of  the  programs  being  evaluated. 

Use of Multivariable-Multirater Matrix to Assess the Validity
of Measures of Importance and Quality

In this section the results of using a multivariable-multirater**
(M-M) methodology to evaluate the validity of the measures
of program importance and quality are discussed. The following
information is applicable to all 4 of the M-M matrices to be
discussed.

Pearson's product-moment correlation was used to produce the
coefficients for the matrices. Reliabilities are in parentheses
and  convergent  validity  coefficients  (hereafter  validities)  are 
underlined.     Each  monorater  triangle  is  enclosed  by  broken  lines 
and  each  heterorater  block  is  enclosed  by  solid  lines.     In  referring 
to  heterovariable  correlations,  the  column  name  will  always  precede 
the  row  name. 


*The reason for only 2 rounds in this research was stated on page 120.

**Based  on  the  multitrait-multimethod  methodology  of  Campbell  and 
Fiske  (1959). 
The evaluations are based on criteria set forth by Campbell
and Fiske (1959). These criteria were discussed in Chapter IV and
are  restated  below  specifically  for  this  research: 

A.  Reliabilities  should  be  high  and  should  exceed  all  other 
correlation  coefficients. 

B. Convergent validity is demonstrated by high validities.

C. Evidence for discriminant validity is provided by 3 criteria:

1. A validity coefficient should be higher than the coefficients
lying in its column and row of the heterorater block.

2. A validity coefficient should be higher than the quality-value
coefficients in the monorater triangles.

3. The same pattern of variable interrelationships should
be shown in all of the heterorater blocks.

In  the  evaluation  of  the  individual  matrices,  reliability,  convergent 
validity  and  discriminant  validity  will  be  considered.     For  discriminant 
validity,  the  3  separate  requirements  will  be  designated  by  the  numbers 
1,  2,  and  3  respectively. 
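Discriminant criteria 1 and 2 above amount to simple comparisons among entries of the M-M matrix, which the following sketch makes explicit for one pair of rater groups. It is a hedged illustration: the dictionary layout, the function names and every coefficient shown are hypothetical and are not the values reported in Tables 24 through 27.

```python
def corr(matrix, a, b):
    """Look up a symmetric correlation stored under either key order."""
    return matrix[(a, b)] if (a, b) in matrix else matrix[(b, a)]


def discriminant_checks(matrix, r1, r2, variables=("importance", "quality")):
    """Apply discriminant criteria 1 and 2 to each variable for raters r1, r2."""
    results = {}
    for v in variables:
        others = [w for w in variables if w != v]
        # Validity: same variable, different rater groups.
        validity = corr(matrix, (r1, v), (r2, v))
        # Criterion 1: coefficients in the validity's row and column
        # of the heterorater block.
        hetero = [corr(matrix, (r1, v), (r2, w)) for w in others]
        hetero += [corr(matrix, (r1, w), (r2, v)) for w in others]
        # Criterion 2: heterovariable coefficients in each monorater triangle.
        mono = [corr(matrix, (r, v), (r, w)) for r in (r1, r2) for w in others]
        results[v] = {
            "validity": validity,
            "criterion_1": all(validity > c for c in hetero),
            "criterion_2": all(validity > c for c in mono),
        }
    return results


# Hypothetical coefficients for two rater groups (illustration only).
matrix = {
    (("GRD", "importance"), ("GRD", "quality")): 0.70,
    (("PRAB", "importance"), ("PRAB", "quality")): 0.60,
    (("GRD", "importance"), ("PRAB", "importance")): 0.80,
    (("GRD", "quality"), ("PRAB", "quality")): 0.65,
    (("GRD", "importance"), ("PRAB", "quality")): 0.55,
    (("GRD", "quality"), ("PRAB", "importance")): 0.50,
}
checks = discriminant_checks(matrix, "GRD", "PRAB")
print(checks)
```

Criterion 3 (the same pattern across all heterorater blocks) requires at least three rater groups and is not covered by this sketch.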

Three  Group  Matrix 

The M-M matrix in Table 24 contains the correlation coefficients
for the 24 programs (see Table 19, p. 115) evaluated by the GRD, PRAB
and community.

Reliability.     Reliabilities  for  the  GRD  and  PRAB  are  based  on 
correlations between round 1 and 2 program means.* They are quite
high and suggest that the measures of importance and quality are
stable. Since community members evaluated the programs only once,
traditional reliabilities could not be computed. However, some
indication of the stability of the community measures could be obtained
from  the  correlations  between  the  means  of  the  programs  commonly 
evaluated  by  the  A  and  B  groups.     These  correlations  are  high,  again 
suggesting  stability  of  the  measures.     The  reliabilities  exceed 
all  other  coefficients  in  the  matrix. 

Convergent  validity.    All  validities,  except  quality  (.3932)  for 
PRAB  and  community,  are  of  sufficient  size  and  significance  as  to 
provide  evidence  of  convergent  validity.     Importance  validities  exceed 
those  for  quality  and  the  highest  validities  occur  between  the  GRD 
and  PRAB. 

Discriminant  validity. 

1.  While  importance  validities  exceed  the  coefficients  in  the 
column  and  row  of  the  heterorater  blocks,  the  validities  for  quality 
do  not. 

2.  Validities  do  not  exceed  all  the  heterovariable  coefficients 
in  the  monorater  triangles. 

3.  In  all  heterorater  blocks,  the  importance  validities  are 
the  highest  and  the  quality-value  coefficients  are  the  second  highest. 
No  other  consistent  interrelationships  exist. 


*Subsequent reliabilities for GRD and PRAB were computed in a similar
manner.

Summary. Evidence for convergent validity exists. While there
is  some  evidence  of  discriminant  validity  for  importance,  there  is 
none  for  quality.     Importance  and  quality  are  highly  correlated. 

Two  Group  Matrix 

The M-M matrix in Table 25 contains the correlation coefficients
for the 55 programs (see Tables 19 and 20, p. 115) evaluated
by the GRD and PRAB.

Reliability. Reliabilities are high and exceed all other
coefficients in the matrix.

Convergent validity. All validities are high and provide
evidence of convergent validity. The importance validity coefficient
exceeds that for quality.

Discriminant  validity. 

1.  The  importance  validity  coefficient  exceeds  the  coefficients 
in  its  row  and  column  of  the  heterorater  block,  but  the  quality 
validity  coefficient  does  not. 

2. The importance validity coefficient exceeds the heterovariable
coefficient in the monorater triangles. The quality validity
coefficient does not.

3.  Not  applicable. 

Summary. Evidence for both convergent and discriminant validity
exists for importance. Evidence for convergent validity but not
discriminant validity exists for quality. Importance and quality are
highly correlated.
Revised Multivariable-Multirater Matrices

The correlations in the preceding matrices were based on all
programs evaluated. However, as noted previously (pp. 121-125), certain
programs failed to meet the Delphi criteria for the existence of a
group judgement. In addition to the Delphi criteria, the requirement
that at least a majority of GRD and PRAB members express an
opinion on a program was also imposed. This criterion was based on
the belief that a group judgement could not be said to exist unless
a majority of group members expressed an opinion. A question of
interest is whether convergent and discriminant validity would improve
if those programs not satisfying the above criteria were eliminated.

Using the above criteria, revised 3 and 2 group sets of programs
were prepared. (Those programs designated by "a" in Table
19 comprise the 3 group set; those programs designated by "b" in
Tables 19 and 20 comprise the 2 group set.) The M-M matrices for
the revised programs are shown in Tables 26 and 27 respectively.

Three  Group  Revised  Matrix 

Reliability. Reliabilities are high for GRD and PRAB and
exceed all other coefficients. Reliabilities for community were
not calculated as 2 of the 4 commonly evaluated programs were
eliminated.

Convergent  validity.     All  validities,  except  quality  for 
GRD  and  community,  are  high  and  provide  evidence  of  convergent 
validity.     Importance  validities  exceed  those  for  quality. 
Discriminant validity.

1. Importance validities exceed the coefficients in the columns
and rows of the heterorater blocks. The quality validity coefficient
for the GRD and PRAB exceeds the coefficients in its column and row
of the heterorater block. The remaining quality validities do not.

2. The validities for the GRD and PRAB and the importance
validity coefficient for the PRAB and community exceed the
quality-importance coefficients in the monorater triangles. The other
validities do not.

3.  Importance  validities  exceed  all  other  coefficients  in 
the  heterorater  blocks.     No  other  consistent  interrelationships 
exist. 

Summary. Evidence for convergent validity exists. Evidence
of discriminant validity for importance exists. While some evidence
of discriminant validity for quality exists, it is not strong.
Importance and quality are highly correlated.

Two  Group  Revised  Matrix 

Reliability.     Reliabilities  are  high  and  exceed  all  other 
coefficients. 

Convergent validity. Validities are high and provide evidence
of convergent validity. The importance validity coefficient exceeds
that for quality.

Discriminant  validity. 

1.     Validities  exceed  coefficients  in  column  and  row  of 
heterorater  block. 
2. Importance validity coefficient exceeds quality-importance
coefficients  in  monorater  triangles.     Quality  validity  coefficient 
exceeds  quality-importance  coefficient  in  monorater  triangle  for 
PRAB  but  not  the  one  in  monorater  triangle  for  GRD. 

3.  Not  applicable. 

Summary. Evidence of convergent and discriminant validity
exists for both importance and quality. However, correlations between
importance and quality are still large.

Although  elimination  of  programs  not  meeting  the  criteria 
for a group judgement had little effect on convergent validity,
discriminant validity improved and quality reflects some discriminant
validity.     However,  the  correlations  between  importance  and  quality 
are  still  high.     While  the  changes  observed  are  not  sufficient  to 
warrant  a  conclusion  that  a  significant  improvement  in  validity  has 
occurred,  they  are  encouraging  enough  to  warrant  further  research. 

Conclusion  as  to  Validity  of  Measures  of  Importance  and  Quality 

Subject to the limitations imposed by response biases (see
pp. 78 and 98), the following overall conclusions are derived from
the results of the M-M evaluations. The importance measures possess
both convergent validity and discriminant validity. Convergent
validity exists for the quality measures, but it is not as strong
as it is for importance. While there is some evidence of discriminant
validity for quality, it is unsatisfactory. These results
indicate that, as measured, importance is more valid than quality.
These findings probably reflect the more stable and uniform
nature  of  importance  which  is  presumed  to  be  the  result  of  cultural 
values  developed  over  a  fairly  long  period  of  time.     Quality,  unlike 
importance,  is  more  likely  affected  by  recent  experiences  which  can 
be  expected  to  vary  from  individual  to  individual.     One  implication 
of  this  is  that  a  better  measure  of  quality  could  be  obtained  if  its 
measurement occurred at the end of each program (analogous to
teacher-course evaluations at the end of each quarter).

Facility  Adequacy 
The  mean  ratings  of  facility  adequacy  for  the  GRD,  PRAB  and 
the community are presented in Table 28. Pearson's product-moment
correlation  coefficient  was  utilized  to  assess  the  agreement  between 
the  3  groups.     The  coefficients  are  presented  below: 

          PRAB      Community

GRD       -.20      .42

PRAB                -.13

None  of  the  correlation  coefficients  is  significant  at  .05.  There 
is  very  little  agreement  on  the  adequacy  of  facilities.     This  lack 
of  agreement  strongly  implies  a  need  for  the  GRD  to  thoroughly  assess 
the  community's  recreational  facility  needs  prior  to  deciding  on 
new  facilities  or  expansion  of  old  ones. 

The correlation between the adequacy ratings of community
groups A and B is .9624 (significant at .001). This very high
correlation between 2 independent samples of the same population provides
evidence of the stability of the community adequacy measures and
of the comparability of the A and B groups.

The  relationship  between  facility  adequacy  and  quality  of 
programs  was  examined  for  those  facilities  and  programs  which,  in 
the  opinion  of  the  researcher,  could  be  expected  to  be  interdependent. 
The  facilities  and  programs  are 

Parks - park and picnic facilities

Swimming pools - skin and scuba diving lessons; springboard
diving lessons; public swimming; youth swim
lessons

Racquetball and handball - racquetball facilities
In each case of rejection the facility was almost identical to the
program. Of those for which the null hypothesis could not be
rejected, for only 1 (archery) were the facility and program identical.
For the other 4, the facility is merely used in providing the program.
While the paucity of facility-program combinations precludes general
conclusions, the test results suggest that there are different types
of programs. For programs in which the facility is all-important,
the adequacy of that facility is a significant factor in the
determination of quality. For programs in which other factors are involved
and may predominate (for example the instructor or the band), facility
adequacy is not a significant factor in the determination of the
quality ratings.
Validity of Reliance on Self-Evaluations

Since program evaluation in recreation has been limited primarily
to self-evaluation studies, one purpose of this research was
to assess the validity of exclusive reliance on information from
self-evaluations. The assessment was made by comparing the measures
of program importance and quality and facility adequacy with those
produced by the PRAB and the community.

In previous sections, the measures of individual program
importance and quality (p. 125) and facility adequacy (p. 136)
produced by the GRD were correlated with those produced by the PRAB
and the community. While for importance and quality significant
correlations were observed, the amount of unexplained variance is
still substantial as Table 29 indicates.


TABLE 29

VARIANCE IN GRD MEASURES EXPLAINED AND UNEXPLAINED
BY PRAB AND COMMUNITY MEASURES

                                             Variance
                                     Explained    Unexplained
                             r       $r^2$        $1 - r^2$

GRD & PRAB (55 programs)
  Importance                .79      .63          .37
  Quality                   .63      .40          .60

GRD & Community (24 programs)
  Importance                .54      .29          .71
  Quality                   .48      .22          .78
For the measures of facility adequacy, no significant correlations
were observed. These findings suggest that the information content
of measures produced by the PRAB and community may be fairly high.
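The explained and unexplained variance figures follow directly from the correlation coefficients: the explained share is the square of r and the unexplained share is its complement. The short sketch below applies this to the GRD and PRAB coefficients reported earlier under group reliability (.7961 and .6345); the function name is an assumption introduced for illustration.

```python
def variance_split(r):
    """Return (explained, unexplained) variance shares for a correlation r,
    rounded to 2 decimal places."""
    explained = r ** 2
    return round(explained, 2), round(1 - explained, 2)


# Coefficients for GRD & PRAB (55 programs) as reported in the text.
importance = variance_split(0.7961)
quality = variance_split(0.6345)
print("Importance:", importance)
print("Quality:", quality)
```

Even with a correlation near .80, more than one-third of the variance in the GRD measures remains unaccounted for by the PRAB measures, which is the point the paragraph above makes.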

A comparison of average GRD ratings with those of the PRAB
and community was also made. Because of the GRD's closeness to
recreation activities and vested interest in its performance, it
was hypothesized that the measures of importance and quality
produced by the GRD would exceed those of the other groups. In order
to test this hypothesis, a t test was used to compare the average
of the program importance means and the average of the program
quality means for the GRD with those for the PRAB and the community.
The averages of the program means for the GRD and the community (based
on 24 programs) and for the GRD and the PRAB (based on 55 programs)
are presented in Table 30. The symbols in parentheses below the
averages were used in the following statement of the null and
alternative hypotheses:

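Null hypotheses:

$U_{1I} = U_{2I}$

$U_{1I} = U_{3I}$

$U_{1Q} = U_{2Q}$

$U_{1Q} = U_{3Q}$

Alternative hypotheses:

$U_{1I} > U_{2I}$

$U_{1I} > U_{3I}$

$U_{1Q} > U_{2Q}$

$U_{1Q} > U_{3Q}$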
TABLE 30

AVERAGE RATINGS OF PROGRAM IMPORTANCE AND QUALITY

                      Importance      Quality

24 programs
  GRD                 3.5214          3.8701
                      ($U_{1I}$)      ($U_{1Q}$)

  Community           3.2224          3.5275
                      ($U_{3I}$)      ($U_{3Q}$)

55 programs
  GRD                 3.5608          3.9586

  PRAB                3.3997          3.7025
                      ($U_{2I}$)      ($U_{2Q}$)


The  results  of  the  t  tests  were: 


Reject:  U^^ = U^-^

         $U_{1I} = U_{2I}$


Accept:

For all 4 comparisons, direction is consistent with the alternative
hypothesis. With 1 exception the t tests provide support for the
existence of favorable bias.
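A one-sided comparison of this kind can be sketched as below. The text does not state which form of the t test was used, so this example assumes a paired test on program means (the same programs rated by both groups); that choice, the function name and the sample means are all assumptions made for illustration and are not the study's data.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(xs, ys):
    """Paired t statistic and degrees of freedom for H1: mean(xs) > mean(ys).

    xs, ys: program means from two rater groups, in the same program order.
    """
    diffs = [x - y for x, y in zip(xs, ys)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Hypothetical program means for six programs.
grd = [4.1, 3.6, 3.9, 3.2, 4.4, 3.8]
prab = [3.9, 3.5, 3.6, 3.3, 4.0, 3.5]
t, df = paired_t(grd, prab)
print(f"t = {t:.3f}, df = {df}")
```

The statistic would then be compared with a tabled one-sided critical value (2.015 for 5 degrees of freedom at the .05 level); a larger t supports the alternative hypothesis that the first group's ratings are higher.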

The results of both the individual and overall comparisons
indicate that information from self-evaluation studies should be
supplemented  with  information  from  sources  independent  of  those  being 
evaluated.     Ideally  the  incremental  value  of  this  information  would 
be  determined  and  equated  with  its  incremental  cost. 
Input-Output Relationships

As  discussed  previously  (pp.  53  and  75),  linear  regression 
analysis  was  to  be  used  in  an  attempt  to  describe  the  nature  of 
the  relationships  between  labor  inputs  and  the  quantity  and  quality 
of  recreation  output.     The  simple  linear  model  applied  to  the  data 
is  illustrated  by  the  following  equation: 

$y = a + bx + e$

where    y = dependent variable

         x = independent variable

         a = y intercept

         b = change in y with respect to a change in
             x (slope of regression line)

         e = error term (difference between the predicted
             y ($\hat{y}$) and the observed y); since it is assumed
             that the expected value of e is 0, e will
             drop out of the actual prediction equation.
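The least squares estimates of a and b, together with the squared correlation, can be computed as in the following sketch. It is a minimal illustration of the simple linear model above, not the statistical package used in this research; the function name and the sample data are assumptions.

```python
def fit_line(xs, ys):
    """Least squares fit of y = a + b*x; returns (a, b, r_squared)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx            # slope: change in y per unit change in x
    a = my - b * mx          # intercept
    r_squared = sxy ** 2 / (sxx * syy)
    return a, b, r_squared


# Hypothetical data lying exactly on y = 1 + 2x.
a, b, r2 = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(a, b, r2)
```

Because e is assumed to have an expected value of 0, the prediction equation uses only the estimated a and b.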

The  equations  for  which  a  and  b  will  be  estimated  are 

E1.     $U = a + bL$;  n = 45

E2.

E2.1    $U_1 = a_1 + b_1 L_1$;  Q = low quality;  n = 6

E2.2    $U_2 = a_2 + b_2 L_2$;  Q = average quality;  n = 25

E2.3    $U_3 = a_3 + b_3 L_3$;  Q = high quality;  n = 13

E3.     $Q = a + b(L/U)$;  n = 45

where    U  =  participant  hours  (output  quantity) 
L  =  labor  hours  (input) 
Q = program quality

L/U = ratio of labor hours to participant hours

n = number of programs for which preceding measures
were obtained

The  data  for  participant  hours,  labor  hours  and  quality  are  given 
in  Tables  17  and  18  (p.  113).     The  quality  measures  used  are  those 
of  the  GRD.*    The  L/U  ratio  was  computed  by  dividing  labor  hours  by 
participant  hours.     For  the  equations  in  E2,  the  observed  range  of 
program  quality  was  divided  into  3  categories:     low  quality  (2.8  -  3.4), 
average quality (3.5 - 4.2) and high quality (4.3 - 4.8). The
participant hours and labor hours associated with programs in each quality
category  provide  the  data  for  equations  E2.1,  E2.2  and  E2.3. 

For equations E1 through E3, the following were obtained:

1. Scattergram in which the dependent variable is plotted
against the independent variable

2. Statistical tables containing

   r and $r^2$
   standard error of estimate
   least squares estimate of "a"
   least squares estimate of "b"
   standard error of "b"
   F value and level of significance

3. Plot of standardized residuals against predicted standardized
dependent variable


*The quality measures of the GRD were used for the following reasons:
(1) no one group's (GRD, PRAB and community) quality measures have been
demonstrated to be superior to those of the other 2 groups; and (2) since the
GRD produced the measures of labor hours and participant hours, all
measures used for the regression analyses will have the same source.
Except for the scattergrams, this information is presented in Appendix
H. The results of applying the simple linear models to the data will
now be discussed.

El.     U  =  a  +  bL 

The estimated prediction equation for all 45 programs is U = -611
+ 10.7L. The $r^2$ of .92 is large and significant at .001. However, the
presence of several extremely large values (see Figure 4) casts doubt
on the meaningfulness of the preceding prediction equation and its
$r^2$.* After removing the 6 most extreme values,* the $r^2$ drops to .46
and the prediction equation becomes U = 1,120 + 7.03L. The 95%
confidence interval for U is $\hat{U}$ - 5,512 < U < $\hat{U}$ + 5,512. Considering the range
in U (264 - 11,955), this confidence interval is very large and
indicates the imprecision of the relationship between U and L. An
examination of the scattergram (Figure 5) reveals a weak linear
relationship, especially for values of L greater than 400 hours.
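The sensitivity of $r^2$ to a few extreme values, which motivated their removal above, can be demonstrated with a small example. The data below are hypothetical and deliberately exaggerated: five points with no linear relationship plus one extreme point. The function name is an assumption introduced for illustration.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between xs and ys."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy ** 2 / (sxx * syy)


# Hypothetical labor hours (L) and participant hours (U); the last
# observation is an extreme value far from the others.
labor = [1, 2, 3, 4, 5, 50]
hours = [2, 1, 3, 1, 2, 60]

r2_all = r_squared(labor, hours)
r2_trimmed = r_squared(labor[:-1], hours[:-1])
print(f"with extreme value:    r^2 = {r2_all:.3f}")
print(f"without extreme value: r^2 = {r2_trimmed:.3f}")
```

With the extreme point included, the squared correlation is close to 1 even though the remaining points show essentially no relationship, which is why both the trimmed result and the range of the data are worth reporting.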

E2.     U = a + bL

E2.1    Low quality

The scattergram for low quality (Figure 6) reveals 1 extreme
outlier. After removing it, an $r^2$ of .74 is obtained and the
prediction equation is U = 155 + 3.76L. The linear relationship (see


*Blalock points out that a few extreme values may produce a high r (and
therefore $r^2$) where none exists among the other values; he states that
where it is not feasible empirically to include more extreme values,
consideration should be given to excluding the extreme values and to
reporting the range of variability (1972, pp. 381-383).
Figure 7) is weak and a 95% confidence interval for U, $\hat{U}$ - 328 < U < $\hat{U}$ + 328,
is almost as large as the range of U (84 - 798).

E2.2    Average  quality 

The scattergram for average quality reflects 1 extreme outlier
(Figure 8). After removing it, an $r^2$ of .34 is obtained and the
prediction equation is U = 798 + 7.66L. The linear relationship is weak
(see Figure 9) and a 95% confidence interval for U, $\hat{U}$ - 5,140 < U < $\hat{U}$ + 5,140,
is as large as the range of U (80 - 10,400).

E2.3    High  quality 

The scattergram for high quality reflects several extreme outliers
(Figure 10) and when they are removed an $r^2$ of .48 is obtained. The
prediction equation is U = 3,839 + 3.63L and the 95% confidence
interval for U is $\hat{U}$ - 6,494 < U < $\hat{U}$ + 6,494. While the relationship between U
and L appears to be linear, it is very weak (Figure 11).

E3.     Q = a + b(L/U)

The $r^2$ for the 45 programs is .001 and is totally insignificant.
The scattergram (Figure 12) shows no relationship (linear or non-linear)
between quality and the labor hour-participant hour ratio. In the
prediction equation Q = 3.96 - .05(L/U), the hypothesis that b = 0 cannot
be rejected (F = .002 and is not significant).

Overall  Conclusion 

Based  on  the  preceding  analyses,  the  following  tentative 
conclusions  are  presented: 
1. While $dU/dL > 0$ as expected, the linear relationship* between
U and L is very weak and the prediction equations are too imprecise
for use in decision making. Segregating programs by level of
quality (E2) did not result in a significant improvement in the strength
of the relationship between U and L.

2. It appears that $dQ/d(L/U) = 0$. There is no apparent
relationship, linear or non-linear, between quality and the labor hour-
participant hour ratio.

3.  The  unsatisfactory  relationships  may  well  be  due  to  the 
fact  that  each  program  is  unique.     Such  being  the  case,  the  analyses 
performed  are  inappropriate  and  the  relationship  between  inputs  and 
outputs  needs  to  be  assessed  program  by  program.     For  this  type  of 
analysis,  different  levels  of  input  and  output  for  each  program  are 
needed.     Since  this  research  was  confined  to  1  year,  the  data  needed 
for  such  an  analysis  was  not  available. 


*Based  on  examination  of  scattergrams  and  residual  plots,  non-linear 
relationships  are  not  appropriate. 


CHAPTER  VI 


CONCLUSIONS 
Introduction 

The  overall  purpose  of  this  research  was  to  provide  evidence 
as  to  the  feasibility  and  efficacy  of  performance  measurement  in  the 
NFP  area.     In  order  to  accomplish  this  purpose,  existing  methodologies 
for NFP performance measurement were reviewed and assessed (Chapter
II); a performance measurement model for the GRD was developed
(Chapter III); and the data specified by the model was collected and analyzed
(Chapter V). Involved in the data collection and analysis were new
applications  of  the  Delphi  technique  and  the  multitrait-multimethod 
methodology  and  an  assessment  of  the  usefulness  of  self-evaluation 
studies  in  performance  measurement. 

In  this  chapter  the  conclusions  reached  in  regard  to  the  above 
will  be  set  forth  and  directions  for  future  research  activity  will  be 
discussed. 

Performance  Measurement  Methodologies 
Need  for  a  Comprehensive  Work  on  Performance  Measurement 

In  this  research  an  attempt  was  made  to  identify  methodologies 
which  have  been  proposed  for  measuring  performance  in  the  NFP  area. 
The  identification  process  revealed  that  while  numerous  methodologies 
exist,  they  have  generally  been  treated  in  isolation  from  one  another. 
A  comprehensive  work  on  NFP  performance  measurement  is  needed  and 
would be of considerable value to NFP administrators and researchers.
Also needed is a thorough survey of both (1) the situations in which
the methodologies have been applied and (2) the results of these
applications. The survey should be designed so as to permit the
determination of the state of development of performance measurement by type
(government, religious, educational, etc.) and level (e.g.,
government could be classified as federal, state and local) of NFP organization.

An  Improved  Performance  Measurement  System 

The review and assessment of NFP performance measurement
methodologies resulted in the conclusion that certain methodologies are
complementary and can be combined into an improved performance
measurement system. The methodologies are (1) experimental and
quasi-experimental research designs (see p. 26) and (2) cost-benefit analysis (see p.
14). The first method provides the most valid and effective way to
determine a program's impact, but it cannot be used to determine the
program's social desirability. The second method is the best way known
of assessing the social desirability of a program, but it cannot be
used to determine the program's impact. Since both impact and social
desirability should be established before implementation of a program,
the methods should be used in conjunction with one another.

Using the combined system, the impact of a program would first
be determined by use of experimental or quasi-experimental research
designs in a pilot study of the proposed program. Assuming the impact
anticipated was found to exist, a cost-benefit analysis could be used
to assess the social desirability of the program. If the program was
found to be both effective and socially desirable, it would be
implemented. Its implementation should be conducted within an experimental
(or quasi-experimental) framework in order to control for the effects
of confounding variables. Periodically the program should be
reevaluated by comparing actual with projected effects, costs and benefits.
Material discrepancies should be investigated with a concomitant
refinement of methodological assumptions and projections.
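
The sequence described above (a pilot impact study, then a cost-benefit
analysis, then implementation with periodic reevaluation) can be sketched
as a simple decision routine. This is an illustration only: the function
names, the minimum-effect threshold, the discount rate and the 10 percent
materiality tolerance are hypothetical and do not come from this research,
and an actual pilot study would rest on a proper experimental design and
significance test rather than on a raw difference in means.

```python
from statistics import mean


def pilot_impact(treated, control):
    """Difference in mean outcome between pilot participants and a control group."""
    return mean(treated) - mean(control)


def net_present_value(benefits, costs, rate):
    """Discounted benefits minus discounted costs; index 0 is the current year."""
    return sum((b - c) / (1 + rate) ** t
               for t, (b, c) in enumerate(zip(benefits, costs)))


def evaluate_program(treated, control, benefits, costs,
                     min_effect=0.0, rate=0.05):
    """Apply the two methods in the order described in the text.

    min_effect and rate are assumed values chosen for illustration.
    """
    if pilot_impact(treated, control) <= min_effect:
        return "reject: anticipated impact not found"
    if net_present_value(benefits, costs, rate) <= 0:
        return "reject: not socially desirable"
    return "implement"


def material_discrepancy(actual, projected, tolerance=0.10):
    """Flag a reevaluation item whose actual value departs from its projection."""
    return abs(actual - projected) > tolerance * abs(projected)
```

The ordering matters in this sketch: the cost-benefit step is reached only
when the pilot study has shown the anticipated impact, which mirrors the
argument that neither method alone establishes both impact and social
desirability.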

Feasibility  of  Performance  Measurement 

The review and assessment of the current state of performance
measurement in the NFP area revealed a widespread belief that
performance measurement is both feasible and efficacious. This belief is
supported by the existence of several operational performance
measurement methodologies (cost-benefit analysis; cost-effectiveness analysis;
PPB; experimental and quasi-experimental research designs for program
evaluation). While implementation of these methodologies is not yet
widespread, the potential appears great. The lack of implementation
appears to be due to

1.  lack  of  administrative  familiarity  with  methodologies 

2.  reluctance  of  administrators  to  be  evaluated 

3.  lack  of  information  systems  to  supply  the  data  needed  to 
implement  the  methodologies. 

One approach to reducing problems 1 and 2 is for researchers interested
in the NFP area to work with NFP administrators in order to inform
them of the benefits and limitations of performance measurement methods
and assist them in the application of these methods.

A first step towards eliminating the third obstacle to
implementation is to formally identify the activities (programs, projects,
etc.) of NFP entities and to develop measures of input and output for
each activity. While traditional methods (e.g., a cost accounting
system) appear adequate for measuring inputs, output measurement will
sometimes require the use of more novel methods (for example, Delphi
and the multitrait-multimethod methodology). Given input and output
measures, an information system to provide these measures to decision
makers can be established.

Performance  Measurement  Model 
Summary  of  Model  Development 

With the cooperation of the GRD, a performance measurement model
was developed for the GRD. Since the model developed (Figure 2, p.
45) goes well beyond the traditional budgetary model characteristic
of most NFP organizations, the ability to collect the data specified
by the model was of primary concern to the researcher. The data
collection focused on measures of input and output for GRD recreation
programs. Since two of these output measures (importance and quality)
were subjective in nature, their validation was a key research objective.

With the assistance of GRD supervisors, major recreation programs
were identified. For most of these programs, the GRD was able to
provide estimates of the following objective inputs and outputs:

1.  direct  costs 

2.  labor  hours  of  input 

3. participant and spectator hours (quantity of output)

4. user fees

The subjective output measures of program importance and quality were
produced from opinions of GRD supervisors, PRAB members and a sample
of residents in the Gainesville community. These same groups also
provided measures of facility adequacy.
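
How the objective measures listed above might be derived from the raw
counts requested on the data collection forms in Appendix D can be
illustrated with a short sketch. The multiplication shown is an assumption
about how the form fields combine, the field names are hypothetical, and
the figures are invented for illustration; they are not data from the
study.

```python
from dataclasses import dataclass


@dataclass
class ClassProgram:
    """Raw counts for an instructional program (hypothetical field names)."""
    classes_per_year: int
    meetings_per_class: int
    participants_per_meeting: float
    hours_per_meeting: float
    instructors_per_meeting: int
    pay_rate_per_hour: float

    def meetings_per_year(self):
        return self.classes_per_year * self.meetings_per_class

    def participant_hours(self):
        """Quantity of output: person-hours of participation per year."""
        return (self.meetings_per_year() * self.participants_per_meeting
                * self.hours_per_meeting)

    def labor_hours(self):
        """Labor input: instructor hours per year."""
        return (self.meetings_per_year() * self.instructors_per_meeting
                * self.hours_per_meeting)

    def direct_labor_cost(self):
        """Direct labor cost per year at a single hourly rate."""
        return self.labor_hours() * self.pay_rate_per_hour


# Invented example figures, not data from the study.
tennis = ClassProgram(classes_per_year=4, meetings_per_class=10,
                      participants_per_meeting=12, hours_per_meeting=1.5,
                      instructors_per_meeting=1, pay_rate_per_hour=3.0)
print(tennis.participant_hours(), tennis.labor_hours(),
      tennis.direct_labor_cost())
```

Keeping the raw counts rather than only the products would allow the same
record to feed both the output measure (participant hours) and the input
measures (labor hours and direct cost).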

The validity of the measures of program importance and quality
was assessed by means of the Delphi technique and the multitrait-
multimethod methodology (multivariable-multirater). The facility
adequacy measures of the three groups were compared for agreement. The
relationships between the quantity and quality of program output and
labor inputs were examined by means of regression analysis.

Conclusions 

Objective input-output measures

The "program" was found to be a useful unit of account for the
measurement of inputs and outputs. While a lack of research resources
precluded measuring objective program inputs and outputs as thoroughly
as desired, the active participation of GRD supervisors in
providing this information indicates the potential for formally collecting
it. The researcher is convinced that with sufficient resources, valid
and reliable objective input-output measures can be obtained.

Subjective  output  measures 

Based on Delphi criteria, valid group judgements of program
importance and quality were found to exist. Based on the multitrait-
multimethod (multivariable-multirater) criteria, the measures of program
importance and quality produced from the group judgements are believed
to be valid. The success achieved in measuring and validating the
importance and quality of recreation programs suggests that subjective
measures are well within the purview of current methodology. To the
extent that subjective measures are found useful to NFP decision makers,
they can and should be provided.

Facility  adequacy 

While  high  agreement  was  found  to  exist  between  the  facility 
adequacy  measures  of  community  groups  A  and  B,  almost  no  agreement  was 
observed  between  the  community,  GRD  and  PRAB.     This  finding  suggests 
the  need  for  a  thorough  assessment  of  the  recreational  facility  needs 
of  the  Gainesville  community  prior  to  the  expansion  of  recreational 
facilities. 

Input-output  relationship 

Although the application of regression analysis failed to reveal
satisfactory relationships between output quantity and quality and
labor inputs, a conclusion that such analysis is inappropriate or that
there are no relationships would be premature. "Better" measures of
labor input and quantity of output should be obtained, and longitudinal
as well as cross-sectional analysis should be performed. Furthermore,
other input measures (e.g., materials) should be obtained and incorporated
into the analyses.
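
The kind of analysis referred to above, an ordinary least squares fit of
program output on labor input, can be sketched as follows. The function
name and the sample figures are hypothetical and are not the study's data
or results. The sketch uses a single input variable; the recommended
extension to other inputs (e.g., materials) would call for multiple
regression.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x has no variation")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_resid = sum((yi - (intercept + slope * xi)) ** 2
                   for xi, yi in zip(x, y))
    # r_squared is undefined when y has no variation; report 0.0 in that case.
    r_squared = 1 - ss_resid / ss_total if ss_total else 0.0
    return slope, intercept, r_squared


# Invented figures: labor hours (input) and participant hours (output)
# for four hypothetical programs.
labor_hours = [1, 2, 3, 4]
participant_hours = [3, 5, 7, 9]
slope, intercept, r_squared = fit_line(labor_hours, participant_hours)
```

A low `r_squared` on cross-sectional data of this kind would not by itself
show that no relationship exists, which is the caution expressed in the
text.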

Overall  conclusions 

The successful development of the performance measurement model
for the GRD and the collection and analysis of the data required for
its implementation provide evidence that NFP performance measurement
is feasible. The model developed appears to meet Knighton's (1972)
requirement for a system which permits the matching of operating
expenses with information that provides an indication of public benefits.

Future  Applications  of  Model 
Usefulness  of  model  information 

The usefulness of the information produced by the model was
assumed by the researcher. Now that the information has been obtained,
its actual usefulness to Gainesville decision makers needs to be
assessed. This assessment should encompass both the cost of and value
(in use) of the specific items of model information. The groups
presently identified for participation in the evaluation of the model's
usefulness are the GRD, PRAB, City Manager's Office and the Gainesville
City Commission.

Replication  of  model 

Because of resource constraints (both money and time), numerous
limitations were imposed on the data collected. Therefore, this
research needs to be replicated with sufficient resources to remove these
limitations. With adequate resources, the following improvements are
believed to be possible:

1.  Program  identification  would  be  exhaustive. 

2. Once all programs had been identified, an information system to
collect the model data would be implemented. This would eliminate the
need to rely on estimates for objective input and output measures.
The data would be collected for several years (longitudinal study)
and therefore input-output relations could be examined for individual
programs as well as across programs.

3. The opinions of a representative sample of community members
would be obtained through use of second requests and interviews. A
random sample of holdouts would be contacted and paid to be interviewed.

4. The opinions of known program users (based on 2 above) would
be obtained for comparison with opinions in 3 above.

5.  Program-specific  as  well  as  general  program  quality  measures 
would  be  obtained. 

6. Contextual effects would be controlled for. More than one
method (in order to identify methods variance in the multivariable-
multirater matrix) would be used to generate measures of program
importance and quality.

Delphi 

The Delphi questionnaires proved to be a satisfactory means of
obtaining opinions from GRD supervisors and PRAB members. Several
participants did, however, question the information content of the
statistical feedback, and in future research experimentation with both
statistical and verbal feedback is recommended.

The reasonable distribution criterion for group value judgements
can be used to identify problem areas. For those activities about which
disagreement exists (bimodal distribution), an investigation can be
undertaken to determine if the differences of opinion are due to (1)
semantics, (2) values, or (3) the existence of different sets of equally
valid facts. Such a management-by-exception tool should be of value
to decision makers.
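
One way such an exception report might be produced is sketched below. The
peak-counting rule is a crude heuristic chosen for illustration and is not
the reasonable distribution criterion as applied in this research; the
function names and ratings are hypothetical, and ratings are assumed to be
coded 1 (very low) through 5 (very high).

```python
from collections import Counter


def category_counts(ratings, categories=(1, 2, 3, 4, 5)):
    """Number of responses falling in each rating category."""
    tally = Counter(ratings)
    return [tally.get(c, 0) for c in categories]


def is_bimodal(ratings):
    """True if the response counts rise, fall, and rise again.

    Heuristic only: even a dip of a single response counts as a second peak.
    """
    counts = [0] + category_counts(ratings) + [0]  # pad so edge peaks count
    peaks = 0
    rising = False
    for previous, current in zip(counts, counts[1:]):
        if current > previous:
            rising = True
        elif current < previous and rising:
            peaks += 1
            rising = False
    return peaks >= 2


def flag_disagreements(ratings_by_program):
    """Names of programs whose ratings look bimodal (exception list)."""
    return [name for name, ratings in ratings_by_program.items()
            if is_bimodal(ratings)]


# Invented ratings for two hypothetical programs.
example = {
    "program A": [1, 1, 1, 5, 5, 5],        # split opinion
    "program B": [2, 3, 3, 4, 4, 4, 5],     # single peak
}
flagged = flag_disagreements(example)
```

Only the flagged programs would then be investigated for differences in
semantics, values, or facts.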
Multivariable-Multirater Matrix

The multivariable-multirater matrix was found to be very useful
in assessing the validity of measures of program importance and quality.
If only convergent validity had been assessed, the high correlations
between importance and quality would not have been noted and erroneous
conclusions concerning the validity of the measures might have been
made.

Because only one method (5-category rating scale) was used for
the measurement of importance and quality, it was not possible to
identify the amount of method variance. In future research more than
one method as well as more than one variable and rater group should be
used.
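
The structure of a multivariable-multirater matrix can be illustrated with
a short sketch that correlates every (variable, rater group) pair across
programs and labels each correlation by type. The scores are invented, the
names are hypothetical, and the labels are shorthand adopted here rather
than terminology from this research. In the usual multitrait-multimethod
reasoning, the convergent correlations (same variable, different rater
groups) should exceed the correlations between different variables; a high
correlation between importance and quality within one rater group is the
kind of result that a convergent-only analysis would miss.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def kind(a, b):
    """Classify a pair of (variable, rater_group) keys."""
    same_variable = a[0] == b[0]
    same_raters = a[1] == b[1]
    if same_variable and not same_raters:
        return "convergent"   # same variable, different rater groups
    if same_raters and not same_variable:
        return "same-rater"   # different variables, same rater group
    return "cross"            # different variables, different rater groups


def mvmr_matrix(scores):
    """Map each pair of keys to (kind, correlation across programs)."""
    keys = sorted(scores)
    return {(a, b): (kind(a, b), pearson(scores[a], scores[b]))
            for a, b in combinations(keys, 2)}


# Invented mean ratings of four hypothetical programs by two rater groups.
scores = {
    ("importance", "GRD"): [4, 3, 5, 2],
    ("importance", "PRAB"): [4, 2, 5, 3],
    ("quality", "GRD"): [3, 4, 2, 5],
    ("quality", "PRAB"): [3, 5, 2, 4],
}
matrix = mvmr_matrix(scores)
```

With a single rating method, as in this research, the matrix cannot
separate method variance; adding a second method would add a third element
to each key.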

Reliance on Self-Evaluations

While self-evaluation studies may provide much valuable
information, the results of this research indicate that exclusive reliance
on such studies to assess an entity's performance is unjustified and
may actually be misleading (as in the case of recreation facility
adequacy). To the extent possible, self-evaluations should be augmented
by outside appraisals. Furthermore, in order to assure the objectivity
of the evaluations, an agency independent of the entity being evaluated
(in the case of the GRD, a separate program evaluation department in
the City government could be used) should be responsible for validating
the information obtained.

The comparison of self with outside evaluations can help decision
makers to identify areas in need of more thorough investigation. An
example from this research will serve to illustrate such usage.
Although the adult flag football program received the lowest importance
rating awarded by the community, it was rated high in importance by
the GRD. An investigation of the program might reveal that its cost
is completely absorbed by user fees and therefore the program is no
burden to the general community. If, however, it were found that the
program cost is borne primarily by the community and that only a few
members of the community benefit from its provision, then the GRD should
have to demonstrate why the general community (through taxes) should
bear the cost of this program.


APPENDIX  A 


PROGRAMS  AND  FACILITIES  OF 
THE  GAINESVILLE  RECREATION  DEPARTMENT 


PROGRAMS  PROVIDED  BY 
AQUATICS  DIVISION 

Youth swim lessons
Adult swim lessons
Water safety and lifesaving instruction
Springboard diving lessons
Water ballet
Skin & scuba diving lessons
Competitive swimming
Public swimming
Swim meets


PROGRAMS PROVIDED BY
ATHLETICS DIVISION

Tennis  lessons 
Tennis  tournaments 
Archery  activities 
Youth  basketball 
Adult  basketball 
Youth  football 
Adult  flag  football 
Youth  baseball 
Adult softball
Girl's softball
Racquetball tournaments
Track and field day
Golf lessons
Women's volleyball
Summer track program


PROGRAMS PROVIDED BY
CENTERS DIVISION

Youth  ceramics 
Adult  ceramics 
Youth  arts  &  crafts 
Adult  arts  &  crafts 
Pre-school  training 
Tumbling  lessons 
Baton  lessons 
Square  dance  lessons 
Cooking  lessons 
Teen  nutrition  lessons 
Modern  dance  lessons 
Adult  exercise  lessons 
Duplicate  bridge 
Recreation  center  dances 
Recreation  center  games 
Wrestling  lessons 
Senior  citizen  activities 
Sewing lessons


PROGRAMS PROVIDED BY
PLAYGROUNDS DIVISION

Adult gymnastics
Youth gymnastics
Easter egg hunt
Bowling lessons
Camping skills instruction
Drama workshop & play
Children's art display
Cheerleading clinic
Supervised playground activities


Park  and  picnic  facilities 
Racquetball  facilities 
Recreation  center  facilities 
Tennis facilities


FACILITIES MAINTAINED BY
GAINESVILLE RECREATION DEPARTMENT

Archery range
Baseball and softball fields
Park and picnic areas
Playgrounds
Racquetball and handball courts
Recreation centers
Swimming pools
Tennis courts


APPENDIX C


COVER  LETTER  FROM  RESEARCHER 
TO  COMMUNITY 


May  19,  1975 


Dear Area Resident:

I am a Ph.D. candidate in Accounting at the University of Florida.
With the cooperation of the Gainesville Recreation Department, I am
conducting a study, for my doctoral thesis, of the importance and
quality of recreational programs provided by the Recreation
Department.

A scientifically determined group of residents in the Gainesville
community has been selected to represent the opinions of the community.
As a member of that group, your opinions will play an important role
in helping the Gainesville Recreation Department provide those
programs which contribute most to the enjoyment of life in the Gainesville
area. By taking a few minutes to complete the enclosed questionnaire,
you will help the Gainesville Recreation Department serve you and the
community better.

Please read carefully the introductory letter by the Director
of the Recreation Department and then complete the questionnaire.
Please do not sign the questionnaire. Your identity will remain
unknown. A stamped self-addressed envelope is enclosed for your reply.

Thank  you  for  your  cooperation. 

Sincerely, 


Marcus  Dunn 

MD/meb 
Enclosure 




APPENDIX  D 
OBJECTIVE  DATA  COLLECTION  FORMS 


Where  Held: 
When  Held: 


Fees  paid  by  participant: 
Instructor  fee 

Supplies/materials  usage  fee 


Participation: 

Number  of  classes  offered  per  year 
Number  of  meetings  per  class 
Average  number  of  participants  per  meeting 
Number  of  hours  a  meeting  lasts 

Direct  Costs  to  Gainesville  Recreation  Department 

Number of instructors per meeting

Type of instructor and rate of pay (per hour):

Part-time staff
Full-time staff:
  Supv. II
  Supv. I
  Aide II
  Aide I
Volunteer


Supplies/materials  cost  per  class 
Equipment  cost  per  class 
Other direct costs per class:


Participation

Number  of  meetings  per  activity  per  year 
Average  number  of  hours  each  meeting  lasts 
Average  number  of  participants  per  meeting 


Direct  Costs  to  Gainesville  Recreation  Department 


Personnel Cost:    Number of Hours Per Meeting    Pay Rate Per Hour

Part-time staff
Full-time staff:
  Supv. II
  Supv. I
  Aide II
  Aide I
Volunteer


Materials/supplies  cost  per  meeting 
Utilities  cost  per  meeting 
Maintenance  cost  per  meeting 
Other cost per meeting


Participation

Number of recreational swimmers per year
Average number of hours of pool use per swimmer per visit to pool

Direct  Costs  to  Gainesville  Recreation  Department 

Personnel Cost:    Number    Pay Rate Per Hour    Number of Hours Per Person

Cashier
Lifeguard
Manager
Other


Materials/supplies  cost  per  year 
Utilities  cost  per  year 
Maintenance  cost  per  year 


Other costs per year


Entry fee per participant:
Where held:
When held:


Participation 

Number of participants
Average number of hours of participation per participant

Spectators 


Number  of  spectators 

Average  number  of  hours  of  viewing  by  spectator 


Direct Costs to Gainesville Recreation Department

Personnel:    Hours worked    Number    Pay Rate

Part-time
Full-time:
  Supv. II
  Supv. I
  Aide II
  Aide I
Volunteer


Supplies/materials  cost 
Equipment  cost 
Trophies,  prizes,  etc. 
Other  (list)   


Entry fee:
Where played:
Time played:


%  of  games  requiring  light 


Participation 

Players:

Number  of  teams 

Number  of  games  played  by  each  team 
Actual  number  of  players  per  team 
Average  number  of  hours  a  game  lasts 

Number  of  teams 

Number  of  practice  sessions  per  team 

Average  number  of  players  per  team  practice  session 

Average  number  of  hours  a  practice  session  lasts 

Spectators: 

Average  number  of  spectators  per  game 

Volunteers: 

Coaches — number  of  hours  per  team  per  season 
Other  volunteer  hours  per  team  per  season 

Direct  Costs  to  Gainesville  Recreation  Department 

Personnel cost per game:    Number    Pay Rate (designate per hour or game)

Referee/umpire
Scorekeeper
Timekeeper
Field Supervisor
Other


Equipment  cost  per  team 
Uniform  cost  per  team 

Other  direct  (non-maintenance)  cost  per  team 
Maintenance  cost  per  game 


Utilities  cost  per  game 


APPENDIX  E 

DELPHI  QUESTIONNAIRE 

All 4 parts of the questionnaire (Part 1 - Part 4) were used for
round 1. For round 2 only Part 1 and Part 2 were used. The section
for feedback information, "Quartiles of Responses," was used on round
2 only.


THE DELPHI TECHNIQUE

"The Delphi technique, developed by the Rand Corporation over
20 years ago, is a method for eliciting and refining group judgements.
The rationale for the procedures is primarily the age-old adage 'Two
heads are better than one,' when the issue is one where exact
knowledge is not available. The procedures have three features: (1)
Anonymous response - opinions of members of the group are obtained by
formal questionnaire. (2) Iteration and controlled feedback - iteration
is effected by a systematic exercise conducted in several iterations,
with carefully controlled feedback between rounds. (3) Statistical
group response - the group opinion is defined as an appropriate aggregate
of individual opinions on the final round. These features are designed
to minimize the biasing effects of dominant individuals, of irrelevant
communications, and of group pressure toward conformity."

The technique has received extensive testing by the Rand
Corporation and others and has proven superior to face-to-face group
discussion and to individuals acting alone. The technique has been
extensively applied by both industry and government. It has been used
primarily to forecast future events, define and rank organizational
goals, and measure the value and quality of services provided by
not-for-profit organizations.


Dear

This is the first in a series of questionnaires designed to
obtain the opinions of the members of the Gainesville Recreation
Advisory Board as to the importance and quality of recreation programs
provided by the Gainesville Recreation Department. In each of the
questionnaires, you will be asked to express your opinion of the
importance (Part 1) and quality (Part 2) of recreation programs. In
addition, for the first questionnaire only (enclosed), you are asked
to

1. Indicate the frequency of participation by you and the
members of your family in the recreation programs (Part 3).

2. Indicate your opinion of the adequacy of recreation facilities
(Part 4).

The  instructions  for  completing  each  part  of  the  questionnaire 
are  given  at  the  beginning  of  that  part.     If  you  have  any  questions 
concerning  how  to  complete  the  questionnaire,  please  contact  me. 

The opinions of each board member will be combined to produce
a group judgement (group response) on the importance and quality of
each program. On the second questionnaire, you will be provided with
the group response for each program, the range of the middle 50
percent of the individual responses, and your response on the previous
questionnaire. In light of this feedback information you will be
asked to reconsider your previous response. (The columns crossed out
in the first questionnaire are for the feedback information.)
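
The feedback described above (the group response, the range of the middle
50 percent of responses, and the respondent's own prior response) could be
computed along the following lines. This is a sketch under stated
assumptions: the group response is taken to be the median, ratings are
assumed to be coded 1 through 5, the "inclusive" quartile interpolation
rule is one of several conventions and is not necessarily the one used in
the study, and the function and field names are hypothetical.

```python
from statistics import median, quantiles


def delphi_feedback(responses, prior_response):
    """Summarize one program's previous-round ratings for the next form.

    responses: all panel members' ratings of the program (at least two).
    prior_response: this respondent's own rating on the previous round.
    """
    q1, _, q3 = quantiles(responses, n=4, method="inclusive")
    return {
        "prior_response": prior_response,
        "low": q1,                     # lower bound of the middle 50 percent
        "median": median(responses),   # assumed group response
        "high": q3,                    # upper bound of the middle 50 percent
    }


# Invented ratings from a five-member panel; the respondent rated 4.
feedback = delphi_feedback([1, 2, 3, 4, 5], prior_response=4)
```

A respondent whose prior response falls outside the low-high range is the
one most directly invited to reconsider on the next round.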

Since anonymity is a very important part of a Delphi exercise,
please  do  not  discuss  the  questionnaire  with  other  board  members  or 
Recreation  Department  administrators  (who  are  also  participating  in 
a  separate  Delphi  exercise)  until  you  complete  the  final  questionnaire. 
I  am  interested  in  your  opinions  uninfluenced  by  other  board  members 
or  recreation  administrators. 

Because your opinions play an important and vital role in my
study, they are extremely valuable. Thank you very much for your
cooperation.

If  you  have  any  questions,  please  contact  me  at  372-7279. 

Sincerely, 


Mark Dunn


PART 1
IMPORTANCE OF PROGRAMS

We want to find out your opinion of the importance of the following
programs to the Gainesville community. We are interested in your
opinion of the importance of the programs to the entire community, not
just to you and your family. To the right of each program are five
(5) boxes which represent different amounts of importance: very low,
low, average, high and very high. Please examine each program and
mark an X in the box which best describes your opinion. Please make
sure you mark one (1) box for each program listed.


Response columns: VERY LOW | LOW | AVERAGE | HIGH | VERY HIGH
Feedback columns (QUARTILES OF IMPORTANCE RESPONSES): PRIOR RESPONSE | LOW | MEDIAN | HIGH

PROGRAMS

1. Youth baseball

2. Drama workshop & play

3.     Park  &  picnic 

4.     Teen  nutrition 

5.     Golf  lessons 

6.     Pre  school 

7.     Recreation  center 

8.     Youth  football 

9.  Springboard 

10.  Tennis 

11.  Bowling 

12.  Tumbling 

13.     Modern  dance 

14.  Youth 

15. Baton

16. Tennis
17.  Youth 

18.  Summer 

19. Adult softball

20. Senior citizen

21. Racquetball

22.  Sewing 

23.     Track  &  field 

24.  Girl's 

25.  Adult 

26.  Duplicate 

27.  Youth  swim 

28. Swim

29. Water safety & lifesaving inst.

30.  Wrestling 

31.  Adult 

32.  Water 

33.  Archery 

34.     Supervised  playground 

35.     Recreation  center 

36. Adult flag

37. Youth

38. Women's

39. Square dance

40. Adult exercise

41.     Adult  arts  & 

42.  Competitive 

43.  Racquetball 

44.  Public 

45.     Camping  skills 

46.     Youth  arts  & 

47.  Cooking 

48.     Easter  egg 

49. Cheerleading

50.     Skin  &  scuba 

51.  Tennis 

52.  Art 

53.  Recreation  center 

54.  Adult 

55. Adult swim


PART 2
QUALITY OF PROGRAMS

We need your opinion of the quality of the following programs
(from the point of view of you and your family). The first five (5)
boxes  to  the  right  of  each  program  represent  different  degrees  of 
quality:    very  poor,  poor,  fair,  good  and  very  good.     Please  examine 
each  program  carefully.     If  you  have  an  opinion  of  its  quality,  mark 
an  X  in  the  box  which  best  describes  your  opinion.     If  you  do  not 
know  the  quality  of  the  program,  mark  an  X  in  the  box  which  is  designated 
no  opinion.     Please  make  sure  you  mark  one  box  for  each  program  listed. 


Response columns: VERY POOR | POOR | FAIR | GOOD | VERY GOOD | NO OPINION
Feedback columns (QUARTILES OF QUALITY RESPONSES): PRIOR RESPONSE | LOW | MEDIAN | HIGH

PROGRAMS

56. Water safety & lifesaving inst.

57.     Recreation  center 

58.  Racquetball 

59. Youth

60.  Public 

61.     Skin  &  scuba  diving 

62.     Easter  egg 

63.     Recreation  center 

64.     Senior  citizen 

65.  Tennis 

66.  Youth 

67.  Tumbling 

68.  Competitive 

69. Camping skills

70. Youth

71. Adult swim

72. Baton lessons

73. Youth

74. Duplicate

75. Youth swim

76. Youth arts

77. Wrestling

78. Track & field

79. Adult arts

80. Springboard diving lessons

81. Youth

82. Summer

83. Tennis

84. Adult

85. Pre-school

86. Adult flag

87. Bowling lessons

88. Tennis

89. Adult

90. Archery

91. Modern dance

92.     Recreation  center 

93.     Square  dance 

94.     Teen  nutrition 

95.  Sewing 

96. Girl's softball

97. Park & picnic

98. Water ballet

99. Drama workshop

100. Adult ceramics

101. Women's volleyball

102. Supervised playground activities

103. Adult exercise

104. Cooking

105. Art

106.  Golf 

107.  Adult 

108.  Cheerleading 

109.  Racquetball 

110. Swim


PART 3

PARTICIPATION  IN  PROGRAMS 


Indicate  the  frequency  of  participation  by  you  and  the  members 
of  your  family  in  the  following  programs.     Do  this  by  marking  an  X  in 
the  box  which  best  describes  you  and  your  family.     Please  make  sure 
you  mark  one  box  for  each  program  listed. 

Response columns (FREQUENCY OF PARTICIPATION): NEVER PARTICIPATE | SOMETIMES PARTICIPATE | FREQUENTLY PARTICIPATE

PROGRAMS

111. Swim meets

112. Youth ceramics

113. Youth baseball

114. Archery activities

115. Skin & scuba diving lessons

116. Youth swim lessons

117. Easter egg hunt

118. Tennis facilities

119. Golf lessons

120. Water ballet

121. Teen nutrition lessons

122. Racquetball tournaments

123. Sewing lessons

124. Park & picnic facilities

125. Competitive swimming

126. Racquetball facilities

127. Public

128.     Square  dance 

129.     Youth  arts  & 

130.  Women's 

131.     Track  &  field 

132.    Adult  swim 

133.  Adult 

134.  Tumbling 

135.     Recreation  center 

136.  Tennis 

137.  Youth 

138.     Springboard  diving 

139.     Camping  skills 

140.     Modern  dance 

141.  Baton 

142.  Cooking 

143.     Adult  flag 

144. Tennis

145.  Supervised  playground 

146. Adult

147. Pre school

148.  Youth 

149.  Wrestling 

150.     Water  safety  & 

151.  Summer 

152.  Girl's 

153.    Adult  exercise 

154.  Cheerleading 

155.  Duplicate 

156.  Bowling 

157. Adult softball

158. Adult arts &

159. Art

160. Adult basketball

161. Youth football

162. Recreation center

163. Senior citizen

164. Drama workshop & play

165. Recreation center


PART 4

ADEQUACY  OF  FACILITIES  AND  PROGRAMS 

If you believe the following facilities are adequate for the
present needs of the community, mark an X in the box designated
adequate.  If you do not believe they are adequate, mark an X in the box
designated inadequate.  Please make sure you mark one box for each
facility listed.


ADEQUACY

Response columns:  ADEQUATE,  INADEQUATE

FACILITIES

166.  Baseball
167.  Archery
168.  Softball
169.  Parks
170.  Recreation
171.  Racquetball & handball
172.  Playgrounds
173.  Swimming
174.  Tennis

Please  list  any  additional  programs  or  facilities  which  you  feel 
are  presently  needed. 

1.   


2. 


3. 


4. 


APPENDIX  F 


COVER  LETTERS  FOR  ROUND  ONE
DELPHI  QUESTIONNAIRE

FROM  DIRECTOR  OF  GRD  TO  GRD  SUPERVISORS

FROM  DIRECTOR  OF  GRD  AND  CHAIRMAN  OF  PUBLIC 
RECREATION  ADVISORY  BOARD  TO  PRAB  MEMBERS 


May  28,  1975 


Dear 

As  you  know,  Mark  Dunn,  a  Ph.D.  candidate  in  accounting  at  the  University 
of  Florida,  is  conducting  a  study  of  recreation  programs  provided  by 
our  Department.     One  part  of  his  study  involves  judgements,  by  informed 
individuals,  of  the  importance  and  quality  of  recreation  programs. 
Your  opinions  of  the  importance  and  quality  of  our  programs  will  be 
solicited  by  a  series  of  questionnaires  (which  you  helped  pre-test). 
These  opinions  will  play  an  important  role  in  Mr.  Dunn's  study. 

Although  the  programs  presented  for  your  evaluation  are  believed  to 
be  representative  of  our  activities,  we  recognize  that  some  may  have 
been  omitted. 

Mr.  Dunn  will  personally  process  and  analyze  each  questionnaire.  In 
view  of  the  fact  that  Mr.  Dunn  has  to  analyze  the  information  from  the 
first  questionnaire  and  return  the  results  of  the  analysis  to  you  on 
the  second  questionnaire,  I  would  like  for  you  to  return  the  first 
questionnaire  (attached)  to  my  office  no  later  than  Friday,  June  6, 
1975.     If  you  have  any  questions  concerning  the  questionnaire  or  the 
overall  study,  please  contact  Mr.  Dunn  at  372-7279. 

Since Mr. Dunn wants your own opinions uninfluenced by other members
of the Department, please do not discuss the questionnaire or your
responses with any other member of this Department until you have
completed the final questionnaire.

I believe that Mr. Dunn's study will provide valuable insight into
our activities and how they can be improved.  Thank you for your
cooperation.

Sincerely,


Albert  R.  Massey,  Director  of  Recreation
GAINESVILLE  RECREATION  DEPARTMENT


May  23,  1975

Dear 

As you know, Mark Dunn, a Ph.D. candidate in accounting at the University
of Florida, is conducting a study of recreational programs provided
by the Gainesville Recreation Department.  One part of his study involves
judgements, by informed individuals, of the importance and quality
of recreation programs.  As a member of the Gainesville Recreation
Advisory Board, you are considerably more aware of recreation activities
than the average citizen.  Therefore, your opinions, which will be
solicited by a series of questionnaires, will play an important role in
Mr. Dunn's study.

Although the programs presented for your evaluation are believed to
be representative of the major recreation activities, we recognize that
some may have been omitted.

Mr.  Dunn  will  personally  process  and  analyze  each  questionnaire.  In 
view  of  the  fact  that  Mr.  Dunn  has  to  analyze  the  information  from  the 
first  questionnaire  and  return  the  results  of  the  analysis  to  you  on 
the  second  questionnaire,  we  would  appreciate  your  returning  the  first 
questionnaire  (enclosed)  to  Mr.  Dunn  by  June  6,  1975.     A  stamped, 
self-addressed  envelope  is  enclosed  for  this  purpose.     If  you  have
any  questions  concerning  either  the  questionnaire  or  the  overall  study, 
please  contact  Mr.  Dunn  at  372-7279. 

Enclosed  please  find 

1.  A  description  of  the  Delphi  technique 

2.  Instructions  for  the  first  questionnaire 

3.  The  first  questionnaire 

We believe that Mr. Dunn's study will provide insight into the Recreation
Department's activities and how they can be improved.  Therefore, we
encourage your cooperation and thank you for it.

Sincerely, 


Hal  Ingman,  Chairman 

GAINESVILLE  RECREATION  ADVISORY  BOARD 


Albert  R.  Massey,  Director  of  Recreation 
GAINESVILLE  RECREATION  DEPARTMENT 


APPENDIX  G 


COVER  LETTER  AND  INSTRUCTIONS  FOR 
ROUND  TWO  DELPHI  QUESTIONNAIRE


Dear 


Thank you very much for your cooperation in completing the first
questionnaire.  In this, the final questionnaire, you are asked to
reconsider your previous responses after taking into account some
feedback information on the responses of other participants to the first
questionnaire.  Taking this information into account, you may revise
your opinion where you feel it is appropriate.  Please rate all
programs whether you change your opinion or not.

The  feedback  information  is  listed  in  the  columns  designated 
"Quartiles  of  Responses."*    The  numbers  in  these  columns  correspond 
to  the  numbers  at  the  top  of  the  columns  used  for  rating  the  Quality 
and  Importance  of  programs.     Thus  for  "Importance,"  we  have 

               Numerical Rating

Very Low              1
Low                   2
Average               3
High                  4
Very High             5

and for "Quality," we have

Very Poor             1
Poor                  2
Fair                  3
Good                  4
Very Good             5
No Opinion            6


The  type  of  feedback  information  is  as  follows: 

1.  Your  own  response  on  the  first  questionnaire  is  provided 
in  the  column  labeled  "Prior  Response."    It  is  written  in  red  ink. 

2.  Measures  of  the  group  response  which  are  designated  "Low", 
"Median"  and  "High." 

The "Median" represents the middle (central) response of the
group: 50% of the responses are less than or equal to the median and
50% of the responses are greater than or equal to the median.

The  number  in  the  "Low"  column  means  that  25%  of  the  responses 
are  less  than  or  equal  to  this  number;  the  number  in  the  "High"  column 
means  that  25%  of  the  responses  are  greater  than  or  equal  to  this 
number.     Thus  at  least  50%  of  the  group  responses  lie  in  the  interval 
designated  by  the  "Low"  and  "High"  numbers.     Responses  outside  of  the 
interval represent, in a statistical sense, extreme scores.  Some feedback
examples follow.

*NOTE:  "No opinion" responses have been excluded from the feedback
information.

Example     Low     Median     High

   a         2         3         4
   b         4         5         5

In example "a", 50% of the responses are less than or equal to 3 and
50% are greater than or equal to 3.  At least 50% of the group responses
are contained in the interval 2 to 4.  In this example 1 and
5 would represent extreme scores.  In example "b", 50% of the responses
are less than or equal to 5 and 50% are greater than or equal
to 5.  At least 50% of the group responses are contained in the interval
4 to 5.  In this example 1, 2 and 3 represent extreme scores.
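
As an illustration, the "Low", "Median" and "High" feedback values described in this letter could be computed along the following lines. This is only a sketch in Python. It assumes a nearest-rank quartile rule, since the convention actually used in the study is not stated. The function names `quartile_feedback` and `extreme_scores` and the sample responses are hypothetical, chosen so that the results match examples "a" and "b".

```python
from typing import Iterable, List, Tuple

NO_OPINION = 6  # "No Opinion" code on the Quality scale; excluded from feedback


def quartile_feedback(responses: Iterable[int]) -> Tuple[int, int, int]:
    """Return (low, median, high) for one program's ratings.

    Assumption: nearest-rank quartiles, i.e. the k-th quartile is the
    smallest rating such that at least k/4 of the responses are <= it.
    """
    ratings = sorted(r for r in responses if r != NO_OPINION)
    if not ratings:
        raise ValueError("no rated responses to summarize")
    n = len(ratings)

    def nearest_rank(k: int) -> int:
        rank = (k * n + 3) // 4  # integer form of ceil(k * n / 4)
        return ratings[rank - 1]

    return nearest_rank(1), nearest_rank(2), nearest_rank(3)


def extreme_scores(responses: Iterable[int], scale=range(1, 6)) -> List[int]:
    """List the scale points that fall outside the Low-to-High interval."""
    low, _, high = quartile_feedback(list(responses))
    return [s for s in scale if s < low or s > high]


if __name__ == "__main__":
    # Hypothetical response sets, not data from the study.
    example_a = [1, 2, 2, 3, 3, 3, 4, 4, 5, 6]  # the 6 is a "No Opinion"
    example_b = [3, 4, 4, 5, 5, 5, 5, 5]
    print(quartile_feedback(example_a), extreme_scores(example_a))
    print(quartile_feedback(example_b), extreme_scores(example_b))
```

With these sample responses the first set gives Low 2, Median 3, High 4 with extreme scores 1 and 5, and the second gives Low 4, Median 5, High 5 with extreme scores 1, 2 and 3. Other quartile conventions, such as interpolation, can give slightly different values for small groups.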

If  you  have  any  questions  concerning  the  feedback  information 
or  the  questionnaire,  please  contact  me  at  372-7279. 

The  opinions  of  each  member  of  the  group  will  be  combined  to 
produce  the  final  group  judgement  on  the  importance  and  quality  of  each 
program.  Again, since anonymity is an important part of a Delphi exercise,
please do not discuss the questionnaire with other Advisory
Board  members  or  Recreation  Department  Administrators. 

Please  return  the  questionnaire  as  soon  as  possible. 


Sincerely, 


Mark  Dunn 


APPENDIX  H 


RESIDUAL  PLOTS  AND
REGRESSION  STATISTICS

Figure 13.  Plot of standardized residuals against standardized predicted dependent variable for 45 programs.

Figure 14.  Plot of standardized residuals against standardized predicted dependent variable for 39 programs.

Figure 15.  Low quality: plot of standardized residuals against standardized predicted dependent variable for 6 programs.

Figure 16.  Low quality group: plot of standardized residuals against standardized predicted dependent variable for 5 programs.

Figure 17.  Average quality group: plot of standardized residuals against standardized predicted dependent variable for 25 programs.

Figure 18.  Average quality group: plot of standardized residuals against standardized predicted dependent variable for 24 programs.

Figure 19.  High quality group: plot of standardized residuals against standardized predicted dependent variable for 13 programs.

Figure 20.  High quality group: plot of standardized residuals against standardized predicted dependent variable for 9 programs.

Figure 21.  Plot of standardized residuals for quality against the predicted standardized values for quality.